Abstract: In this work, we obtain sufficient conditions for the 
"stability" of our recently proposed algorithms, Least Squares 
Compressive Sensing residual (LS-CS) and modified-CS, for 
recursively reconstructing sparse signal sequences from noisy 
measurements. By "stability" we mean that the number of misses 
from the current support estimate and the number of extras 
in it remain bounded by a time-invariant value at all times. 
We show that, for a signal model with fixed signal power and 
support set size; support set changes allowed at every time; and 
gradual coefficient magnitude increase/decrease, "stability" holds 
under mild assumptions - bounded noise, high enough minimum 
nonzero coefficient magnitude increase rate, and large enough 
number of measurements at every time. A direct corollary is that 
the reconstruction error is also bounded by a time-invariant value 
at all times. If the support of the sparse signal sequence changes 
slowly over time, our results hold under weaker assumptions than 
what simple compressive sensing (CS) needs for the same error 
bound. Also, our support error bounds are small compared to 
the support size. Our discussion is backed up by Monte Carlo 
simulation based comparisons. 



I. Introduction 

The static sparse reconstruction problem has been studied
for a while [2], [3], [4]. The recent papers on compressive
sensing (CS) [5], [6], [7], [8], [9], [10] (and many other
more recent works) provide the missing theoretical guarantees:
conditions for exact recovery and error bounds when
exact recovery is not possible. But for recovering a time
sequence of sparse signals, with time-varying sparsity patterns,
most existing approaches are batch methods, e.g. [11], [12].
Our recent work on Least Squares CS-residual (LS-CS) and
Kalman filtered CS-residual (KF-CS) [13], [14], and later on
modified-CS [15], [16], first studied the problem of recursively
recovering a time sequence of sparse signals, with time-varying
sparsity patterns, using far fewer measurements
than what simple CS (CS done at each time separately)
needs. By "recursive" reconstruction, we mean that we want
to use only the current measurement vector and the previous
reconstructed signal to reconstruct the current signal. The
storage and computational complexity of these solutions is
only as much as that of simple CS, but their reconstruction
performance is significantly better. LS-CS and modified-CS
only use the assumption that the sparsity pattern (support in
the sparsity basis) changes slowly over time. As we show in
A part of this work was presented at Allerton 2010 [1].



Fig. 1 and in [16], this is a valid assumption for many medical
image sequences. KF-CS also uses slow signal value change.

Denote the support estimate from the previous time by $T$.
Modified-CS tries to find a signal that is sparsest outside of $T$
among all signals that satisfy the data constraint. It was first
introduced in [15], [16], where we studied the noise-free case
and obtained exact recovery conditions for it. LS-CS uses a
different approach. It replaces CS on the observation by CS
on the least squares (LS) residual computed by assuming that
$T$ is the correct support [13], [14]. In this work, we obtain
the conditions required for "stability" of LS-CS, modified-CS
and of an improved version of modified-CS which we
call "modified-CS with add-LS-del" (it improves the support
estimation step of modified-CS). By "stability" we mean that
the number of misses from the current support estimate and
the number of extras in it remain bounded by a time-invariant
value at all times. A direct corollary is that the reconstruction
errors are also bounded by a time-invariant value at all times.

A. Related Work 

LS-CS and modified-CS are causal and recursive approaches 
that only rely on the slow support change assumption. Another 
causal and recursive approach, that uses approximate belief 
propagation, has been proposed in very recent work [17]. This
is a fully Bayesian approach that assumes prior probabilistic 
models on both slow support and slow signal value change. 
Some very interesting numerical experiments are shown. 

"Recursive sparse reconstruction" also sometimes refers to 
homotopy methods, e.g. [18], [19], whose goal is to use the
past reconstructions and homotopy to speed up the current 
optimization, but not to achieve accurate recovery from fewer 
measurements (than what simple CS needs). Algorithms that 
improve the reconstruction of a single signal recursively as 
more measurements come in, such as those in [20], [21], [19],
are also sometimes referred to as "recursive sparse recovery" 
algorithms. Clearly, the goals in the above works are quite 
different from ours. 

Also, causal but batch algorithms for recovering sparse 
signal sequences, with time-invariant support, from fewer 
measurements were proposed in [22].

Other related ideas in literature include the following. Two 
approaches related to modified-CS are [23] and weighted $\ell_1$
[24]. But both of these focus only on static sparse recovery
with prior support knowledge. The work of [24] obtains exact 
Fig. 1. Slow support change in medical image sequences.

The two-level Daubechies-4 2D discrete wavelet transform (DWT)
served as the sparsity basis. Since real image sequences are only
approximately sparse, we use $N_t$ to denote the 99%-energy support
of the DWT of these sequences. The support size, $|N_t|$, was 6-7% of
the image size for both sequences. We plot the number of additions
(left) and the number of removals (right) as a fraction of $|N_t|$. Notice
that all changes are less than 2% of the support size.



recovery thresholds for weighted $\ell_1$, similar to those in Q,
for the case when a probabilistic prior on the signal support
is available. Iterative support estimation approaches (using the
recovered support from the first iteration for a second weighted
$\ell_1$ step and doing this iteratively) have been studied in recent
work [25], [26], [27]. This is done for iteratively improving
the recovery of a single signal.

To the best of our knowledge, stability over time has not
been studied in the above works for recursive sparse recovery,
except in [28] (KF-CS and LS-CS) or [14] (LS-CS). Our result
from [28] is under strong assumptions, e.g. it is for a random
walk signal change model (which has unbounded signal power
and hence is the easier but unrealistic case), and it requires
strong assumptions on the measurement matrix. Our result
for LS-CS stability from [14] holds under mild assumptions
and is for a fairly realistic signal change model. The only
limitation is that it assumes that support changes occur
"every-so-often" (every $d$ time units, there are $S_a$ support additions
and removals). But from testing the slow support change
assumption for real data (medical image sequences), it has
been observed that support changes usually occur at every
time, e.g. see Fig. 1. This important case is the focus of the
current work. Moreover, in [14], we only studied LS-CS. In
this work we study both LS-CS and modified-CS and also
modified-CS with add-LS-del.



B. Paper Organization 

The paper is organized as follows. We give the problem
definition in Sec. II-A and we overview our results in Sec.
II-B. We describe the signal model that we assume for
proving stability in Sec. III. In Sec. IV we obtain sufficient
conditions for the stability of modified-CS and discuss the
implications of the result as well as its limitations. In Sec. V
we introduce modified-CS with add-LS-del to address some of
the limitations of modified-CS and obtain its stability result.
The stability result for modified-CS with add-LS-del is more
difficult to obtain because of its improved support estimation
procedure. But, in the end the result is also stronger. The result
for LS-CS stability is obtained in Sec. VI and compared with
previous results. Simulation experiments are discussed in Sec.
VII. Conclusions are given in Sec. VIII. The results' overview
of Sec. II-B and some discussions in the later sections can be
shortened after review if needed, to make the paper compact.



II. Notation, Problem Definition and Overview of 
Results 

We define notation and give the problem formulation in Sec.
II-A. We give a brief overview of our results in Sec. II-B.

A. Notation and Problem Definition 

We let $[1,m] := [1,2,\dots,m]$. We use $T^c$ to denote the
complement of a set $T$ w.r.t. $[1,m]$, i.e. $T^c := \{i \in [1,m] :
i \notin T\}$. We use $|T|$ to denote the cardinality of $T$. Also,
$\emptyset$ denotes the empty set. The set operations $\cup$, $\cap$, $\setminus$ have their
usual meanings (recall that $A \setminus B := A \cap B^c$).

For a vector, $v$, and a set, $T$, $v_T$ denotes the $|T|$ length
sub-vector containing the elements of $v$ corresponding to the
indices in the set $T$. $\|v\|_k$ denotes the $\ell_k$ norm of a vector $v$.
If just $\|v\|$ is used, it refers to $\|v\|_2$. Similarly, for a matrix
$M$, $\|M\|_k$ denotes its induced $k$-norm, while just $\|M\|$ refers
to $\|M\|_2$. $M'$ denotes the transpose of $M$ and $M^\dagger$ denotes the
Moore-Penrose pseudo-inverse of $M$ (when $M$ is tall, $M^\dagger :=
(M'M)^{-1}M'$). Also, $M_T$ denotes the sub-matrix obtained by
extracting the columns of $M$ corresponding to indices in $T$.

At all times, $t > 0$, we assume the following observation
model:

$$y_t = A x_t + w_t, \quad \|w_t\| \le \epsilon \tag{1}$$

where $x_t$ is an $m$ length sparse vector with support $N_t$; $y_t$ is
the $n < m$ length observation vector at time $t$; and $w_t$ is the
observation noise. As we explain later, our algorithms need
more measurements at the initial time, $t = 0$. We use $n_0$ to
denote the number of measurements used at $t = 0$ and we use
$A_0$ to denote the corresponding $n_0 \times m$ measurement matrix,
i.e. at $t = 0$, we have

$$y_0 = A_0 x_0 + w_0, \quad \|w_0\| \le \epsilon \tag{2}$$

The term "support", as usual, refers to the set of indices of
the nonzero elements of $x_t$.

Our goal is to recursively estimate $x_t$ using $y_1, \dots, y_t$. By
recursively, we mean, use only $y_t$ and the estimate from $t - 1$,
$\hat{x}_{t-1}$, to compute the estimate at $t$.

The $S$-restricted isometry constant (RIC) B, $\delta_S$, for the
matrix, $A$, is the smallest real number satisfying

$$(1 - \delta_S)\|c\|^2 \le \|A_T c\|^2 \le (1 + \delta_S)\|c\|^2 \tag{3}$$

for all sets $T \subseteq [1,m]$ of cardinality $|T| \le S$ and all real
vectors $c$ of length $|T|$. The restricted orthogonality constant
(ROC) ID, $\theta_{S_1,S_2}$, is the smallest real number satisfying

$$|c_1' A_{T_1}' A_{T_2} c_2| \le \theta_{S_1,S_2} \|c_1\| \, \|c_2\| \tag{4}$$

for all disjoint sets $T_1, T_2 \subseteq [1,m]$ with $|T_1| \le S_1$, $|T_2| \le S_2$
and $S_1 + S_2 \le m$, and for all vectors $c_1$, $c_2$ of length $|T_1|$,
$|T_2|$ respectively.

In this work, $\delta_S$, $\theta_{S_1,S_2}$ always refer to the RIC, ROC for
the measurement matrix $A$ which is used at $t > 0$. If we refer
to the RIC of any other matrix, e.g. $A_0$, we use $\delta_S(A_0)$.

We use $\alpha$ to denote the support estimation threshold used
by modified-CS and we use $\alpha_{add}$, $\alpha_{del}$ to denote the support
addition and deletion thresholds used by modified-CS with
add-LS-del and by LS-CS. We use $\hat{N}_t$ to denote the support
estimate at time $t$. To keep notation simple, we avoid using
the subscript $t$ wherever possible.

Definition 1 ($T_t$, $\Delta_t$, $\Delta_{e,t}$): We use $T_t := \hat{N}_{t-1}$ to denote
the support estimate from the previous time. This serves as the
predicted support at time $t$. We use $\Delta_t := N_t \setminus T_t$ to denote
the unknown part of $N_t$ and $\Delta_{e,t} := T_t \setminus N_t$ to denote the
"erroneous" part of $T_t$. In many places in the manuscript, we
remove the subscript $t$ to keep notation simple.
With the above definition, clearly,

$$N_t = T_t \cup \Delta_t \setminus \Delta_{e,t}.$$

Definition 2 ($\tilde{T}_t$, $\tilde{\Delta}_t$, $\tilde{\Delta}_{e,t}$): We use $\tilde{T}_t := \hat{N}_t$ to denote the
final estimate of the current support; $\tilde{\Delta}_t := N_t \setminus \tilde{T}_t$ to denote
the "misses" in $\hat{N}_t$ and $\tilde{\Delta}_{e,t} := \tilde{T}_t \setminus N_t$ to denote the "extras".

We sometimes refer to $\Delta$, $\Delta_e$ as the predicted support errors
and to $\tilde{\Delta}$, $\tilde{\Delta}_e$ as the final (or estimated) support errors. The
sets $T_{add}$, $\Delta_{add}$, $\Delta_{e,add}$ are defined in Definition 4 (Sec. V).

If two sets $B$, $C$ are disjoint, we just write $D \cup B \setminus C$ instead
of writing $(D \cup B) \setminus C$, e.g. $N_t = T \cup \Delta \setminus \Delta_e$.
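As a small illustration of Definitions 1 and 2, the set identities above can be checked with plain Python sets. The index values below are made up for illustration and do not come from the paper; the variable names are ours.

```python
# Hypothetical example sets; the indices are illustrative only.
N_t = {2, 12, 66, 79}      # true support at time t
T_t = {2, 12, 74, 91}      # predicted support (previous support estimate)

Delta = N_t - T_t          # unknown part of the support
Delta_e = T_t - N_t        # erroneous part of T_t

# Delta is disjoint from T_t and Delta_e is a subset of T_t, so the
# expression N_t = T_t U Delta \ Delta_e is unambiguous.
reconstructed = (T_t | Delta) - Delta_e
assert reconstructed == N_t

# Cardinality identity used later: |T| = |N| + |Delta_e| - |Delta|
assert len(T_t) == len(N_t) + len(Delta_e) - len(Delta)
```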

We refer to the left (right) hand side of an equation or 
inequality as LHS (RHS). 

In this work, "modified-CS" refers to the solution of (9).
Also, simple CS refers to the solution of (9) with $T = \emptyset$.

B. Overview of Results 

When measurements are noisy, the reconstruction errors of
modified-CS and of LS-CS can easily be bounded as a function
of the support size, $|N_t|$, and of the predicted support error
sizes, $|\Delta_t|$ and $|\Delta_{e,t}|$ [29], [14]. The bound is small at time
$t$ if $|\Delta_t|$ and $|\Delta_{e,t}|$ are small enough. But smallness of the
predicted support errors, $\Delta_t$, $\Delta_{e,t}$, depends on the accuracy
of the previous reconstruction, and thus, in general, it may
happen that, over time, the error bound keeps increasing. Such
a result is of limited use for a recursive reconstruction problem.
There is thus a need to obtain conditions under which one can
show "stability", i.e. ensure that a time-invariant bound holds
on the sizes of these support errors. Combining this with the
error bound result will imply that the reconstruction error is
also bounded by a time-invariant value at all times.

In this work, we obtain results for the stability of three
algorithms: (a) modified-CS; (b) "modified-CS with add-LS-del"
and (c) LS-CS. "Modified-CS with add-LS-del" improves
the support estimation step of modified-CS by using a three
step approach first introduced in [13], [14] and in [30],
[31]: support addition with a smaller threshold, followed
by LS estimation on the new support, and finally support
deletion using the LS estimate. Using add-LS-del significantly
improves both the stability result we can prove (as argued in
Sec. V-B) and the empirical reconstruction performance we
get (see Sec. VII).

All our results are obtained under a bounded observation
noise assumption and for a signal model with

1) support changes ($S_a$ additions and $S_a$ removals) occurring
at every time, $t$,

2) magnitude of the newly added coefficients increasing
gradually, and similarly for decrease before removal,

3) support size, $|N_t| = S_0$ at all times and the signal
power¹, $\|x_t\|^2$, also constant at all times.

Our results have the following form. For a given number and
type of measurements (i.e. for a given measurement matrix,
$A$), and for a given noise bound, $\epsilon$, if,

1) the support estimation threshold(s) is/are appropriately
set,

2) the support size, $S_0$, and the newly added (or removed)
support size, $S_a$, are small enough,

3) the newly added coefficients' increase rate (existing
large coefficients' decrease rate), $r$, is large enough, and

4) the initial number of measurements, $n_0$, is large enough
for accurate initial reconstruction using simple CS,

then, the support error sizes are bounded by time-invariant
values: we show that $|\tilde{\Delta}_t| \le 2S_a$, $|\tilde{\Delta}_{e,t}| = 0$ and $|\Delta_t| \le 2S_a$,
$|\Delta_{e,t}| \le S_a$. A direct corollary is that the reconstruction error
is also bounded by a time-invariant value at all times.

Remark 1: The reason we need to assume bounded noise
is as follows. When the noise is unbounded, e.g. Gaussian,
all error bounds for CS and, similarly, all error bounds for
LS-CS or modified-CS hold with "large probability", e.g. see
[32], [14]. For stability, we need the error bound for LS-CS
or modified-CS to hold at all times, $0 \le t < \infty$ (this, in turn,
is used to ensure that the support gets estimated with bounded
error at all times). Clearly, this will be a zero probability event.
As an aside, most existing works which use the RIC based
approach of Candes et al. to bound the error of noisy sparse
recovery, or of noisy sparse recovery with partial support
knowledge, also assume bounded noise, e.g. [9], [10], [29].

Remark 2: We should mention that constant or bounded 
signal power is both the more practical case (since, in practice, 
signal power never keeps increasing unboundedly) and is also 
the more difficult case. This is because the accuracy of the 
reconstruction at time t + 1 relies heavily on the correct 
detection of the small elements at time t. Correct detection 
will become easier for larger signal power (or, to be precise, 
for larger power of the smallest nonzero coefficients). 

For our signal model, slow support change translates to
$S_a \ll S_0$. Under this assumption, clearly, $2S_a \ll S_0$, and so
our support error bounds are small compared to the support
size, $S_0$, making our stability results meaningful. We can argue
that our results hold under weaker assumptions (allow larger
values of $S_0$), for a given measurement matrix $A$, than the
corresponding simple CS (CS done at each time separately)
result. Since simple CS is not a recursive approach, the CS
error bound from [10] (or other works) also serves as a
stability result for it. Also, we can argue that modified-CS
with add-LS-del needs the weakest conditions on the number
of measurements, $n$, and on the rate of coefficient magnitude
increase/decrease, $r$. Modified-CS needs similar conditions on
$n$, but needs a larger $r$. LS-CS needs the strongest conditions
on both $n$ and $r$. Since we can only compare sufficient
conditions or upper bounds, we back up all our discussion
with simulation experiments to compare actual reconstruction
¹Usually signal power refers to the expected value of the squared 2-norm of the
signal, $E[\|x_t\|^2]$. In our work, we assume a deterministic signal model and
hence signal power just refers to $\|x_t\|^2$.
Fig. 2. An example of Signal Model 1 with $m = 100$, $S_0 = 12$,
$S_a = 1$, and $d = 4$. Thus at any time it contains $2S_a = 2$ elements
each with magnitude $r$, $2r$, and $3r$ and $S_0 - (2d - 2)S_a = 6$ elements
with stable magnitude $M = 4r$. We show each support element's
magnitude inside a square box and its index just above the box.
The up and down arrows below the $N_{t-1}$ box indicate whether the
element increases or decreases. An "=" indicates that the element
magnitude remains constant at $4r$. In both $N_{t-1}$ and $N_t$ we have
circled the small elements' set $\mathcal{S}_{t-1}(3)$ and $\mathcal{S}_t(3)$ respectively.

performance. 

III. Signal Model for Studying Stability 

The modified-CS or LS-CS algorithms do not assume any 
signal model. But for showing stability, we need certain 
assumptions on the signal change over time. 

Signal Model 1: Assume the following.

1) (addition) At each $t > 0$, $S_a$ new coefficients get added
to the support at magnitude $r$. Denote this set by $\mathcal{A}_t$.

2) (increase) At each $t > 0$, the magnitude of $S_a$ coefficients
out of all those which had magnitude $(j - 1)r$ at
$t - 1$ increases to $jr$. This occurs for all $2 \le j \le d$. Thus
the maximum magnitude reached by any coefficient is
$M := dr$.

3) (decrease) At each $t > 0$, the magnitude of $S_a$ coefficients
out of all those which had magnitude $(j + 1)r$ at
$t - 1$ decreases to $jr$. This occurs for all $1 \le j \le (d - 2)$.

4) (removal) At each $t > 0$, $S_a$ coefficients out of all those
which had magnitude $r$ at $t - 1$ get removed from the
support (magnitude becomes zero). Denote this set by
$\mathcal{R}_t$.

5) (initial time) At $t = 0$, the support size is $S_0$
and it contains $2S_a$ elements each with magnitude
$r, 2r, \dots, (d - 1)r$, and $(S_0 - (2d - 2)S_a)$ elements with
magnitude $M$.

We show an example of the above signal model in Fig. 2.
The above model has the following realistic features: (a) an
equal number, $S_a$, of additions and removals to (from) the
support occur at every time, $t$; (b) a newly added coefficient
gets added at a small magnitude; (c) the magnitude of any nonzero
element either remains constant, or increases gradually at
rate $r$, but not beyond a maximum magnitude $M := dr$, or
decreases gradually at rate $r$; and (d) at all times, the signals
have the same support set size, $|N_t| = S_0$, and the same signal
power, $\|x_t\|^2 = (S_0 - (2d - 2)S_a)M^2 + 2S_a \sum_{j=1}^{d-1} j^2 r^2$.

In practice, the number of additions/removals to the support
is never exactly equal, but varies in a small range over time. A
similar thing holds for the coefficient increase/decrease rate, $r$,
or for the stable magnitude, $M$. But for notational simplicity,
we ignore these variations². Also, in practice, different
nonzero elements may have different magnitude increase rates,
$r_i$, and different stable magnitudes, $M_i$. It will be possible to
extend our results to this latter case fairly easily, and we expect
that the result will require a lower bound on $\min_i r_i$.

Signal Model 1 does not specify a particular generative
model. Two examples of signal models that satisfy the above
assumptions are given in Appendix A. Briefly, in the first
model, at each $t$, $S_a$ new elements, randomly selected from
$N_{t-1}^c$, get added to the support at initial magnitude, $r$,
and equally likely sign. Their magnitude keeps increasing
gradually, at rate $r$, for $d$ time units after which it becomes
constant at $M := dr$. The sign does not change. Also, at each
time, $t$, $S_a$ randomly selected elements out of the "stable"
elements' set (set of elements which have magnitude $M$ at
$t - 1$), begin to decrease at rate $r$ and this continues until
their magnitude becomes zero, i.e. they get removed from the
support. A second possible generative model randomly selects
$S_a$ out of the $2S_a$ current elements with magnitude $jr$ and
increases them, and decreases the other $S_a$ elements.
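To make the second generative model concrete, here is a minimal simulation sketch. It is our own illustration, not the construction of Appendix A: it tracks only magnitudes (as integer multiples of $r$) and ignores signs, and it additionally lets $S_a$ of the stable elements start decreasing at each step so that every magnitude level keeps its population. The function names `init_levels`, `step` and `power` are hypothetical.

```python
import random

def init_levels(m, S0, Sa, d, rng):
    """Support at t = 0 as {index: j}, where the magnitude is j * r."""
    assert d >= 2 and S0 - (2 * d - 2) * Sa >= Sa
    idx = rng.sample(range(m), S0)
    levels = {}
    for j in range(1, d):                      # 2*Sa elements at each level j*r
        for i in idx[(j - 1) * 2 * Sa : j * 2 * Sa]:
            levels[i] = j
    for i in idx[(d - 1) * 2 * Sa :]:          # the rest sit at M = d*r
        levels[i] = d
    return levels

def step(levels, m, Sa, d, rng):
    """One time step: Sa additions, Sa removals, gradual increase/decrease."""
    new = dict(levels)
    for j in range(1, d):
        at_j = [i for i, lv in levels.items() if lv == j]
        up = set(rng.sample(at_j, Sa))         # Sa of the 2*Sa elements increase
        for i in at_j:
            if i in up:
                new[i] = j + 1
            elif j == 1:
                del new[i]                     # magnitude r -> 0: removed
            else:
                new[i] = j - 1
    stable = [i for i, lv in levels.items() if lv == d]
    for i in rng.sample(stable, Sa):           # Sa stable elements start decreasing
        new[i] = d - 1
    outside = [i for i in range(m) if i not in levels]
    for i in rng.sample(outside, Sa):          # Sa new elements enter at magnitude r
        new[i] = 1
    return new

def power(levels):
    """Signal power in units of r^2."""
    return sum(lv * lv for lv in levels.values())

rng = random.Random(0)
levels = init_levels(m=100, S0=12, Sa=1, d=4, rng=rng)
for _ in range(20):
    levels = step(levels, m=100, Sa=1, d=4, rng=rng)
    # Support size and signal power stay constant over time.
    assert len(levels) == 12 and power(levels) == 124
```

With the Fig. 2 parameters, the power is $6 \cdot 16 + 2(1 + 4 + 9) = 124$ in units of $r^2$, which matches the closed-form expression for $\|x_t\|^2$ given with the model.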

To understand the implications of the assumptions in Signal
Model 1, we define the following sets.

Definition 3: Define the following.

1) For all $0 \le j \le d - 1$, let

$$\mathcal{D}_t(j) := \{i : |x_{t,i}| = jr, \ |x_{t-1,i}| = (j + 1)r\}$$

denote the set of elements that decrease from $(j + 1)r$
to $jr$ at time, $t$.

2) For all $1 \le j \le d$, let

$$\mathcal{I}_t(j) := \{i : |x_{t,i}| = jr, \ |x_{t-1,i}| = (j - 1)r\}$$

denote the set of elements that increase from $(j - 1)r$
to $jr$ at time, $t$.

3) For all $1 \le j \le d - 1$, let

$$\mathcal{S}_t(j) := \{i : 0 < |x_{t,i}| < jr\}$$

denote the set of small but nonzero elements, with
smallness threshold $jr$.

4) Clearly,

a) The newly added set, $\mathcal{A}_t = \mathcal{I}_t(1)$,

b) The newly removed set, $\mathcal{R}_t = \mathcal{D}_t(0)$,

c) $|\mathcal{I}_t(j)| = S_a$, $|\mathcal{D}_t(j)| = S_a$, $|\mathcal{S}_t(j)| = 2(j - 1)S_a$.

Consider a $1 < j \le d$. From Signal Model 1 it is clear
that at any time, $t$, $S_a$ elements enter the small elements' set,
$\mathcal{S}_t(j)$, from the bottom (set $\mathcal{A}_t$) and $S_a$ enter from the top
(set $\mathcal{D}_t(j - 1)$). Similarly $S_a$ elements leave $\mathcal{S}_t(j)$ from the
bottom (set $\mathcal{R}_t$) and $S_a$ from the top (set $\mathcal{I}_t(j)$). Thus,

$$\mathcal{S}_t(j) = \mathcal{S}_{t-1}(j) \cup (\mathcal{A}_t \cup \mathcal{D}_t(j - 1)) \setminus (\mathcal{R}_t \cup \mathcal{I}_t(j)) \tag{5}$$

²To model the variations over time compactly, a probabilistic signal change
model would be a better one. But that would make our analysis a lot more tricky
since the reconstruction error bounds, which form the starting point for our
stability results, do not assume any randomness [10], [14], [29]. In particular,
they do not treat the sparse signal as a random variable.
To look at an example, see Fig. 2. Consider $j = 3$. Notice
that $\mathcal{S}_{t-1}(3) = \{2, 91, 12, 74\}$ and $\mathcal{S}_t(3) = \{79, 12, 2, 66\}$.
Also, $\mathcal{A}_t = \{79\}$, $\mathcal{R}_t = \{91\}$, $\mathcal{I}_t(3) = \{74\}$ and $\mathcal{D}_t(2) = \{66\}$.
Clearly $\{2, 91, 12, 74\} \cup (\{79\} \cup \{66\}) \setminus (\{91\} \cup \{74\}) =
\{79, 12, 2, 66\}$, i.e. (5) holds.
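The Fig. 2 example can be checked mechanically with Python sets. The index values are the ones quoted in the text; the variable names are ours.

```python
# Sets from the Fig. 2 example with j = 3.
S_prev = {2, 91, 12, 74}   # small elements' set at t-1, threshold 3r
A_t = {79}                 # newly added
R_t = {91}                 # newly removed
I_t3 = {74}                # increased from 2r to 3r
D_t2 = {66}                # decreased from 3r to 2r

# Update rule for the small elements' set.
S_curr = (S_prev | (A_t | D_t2)) - (R_t | I_t3)
assert S_curr == {79, 12, 2, 66}

# Rearranged form of the same identity.
assert (S_prev | A_t) - R_t == (S_curr | I_t3) - D_t2
```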

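Since $\mathcal{A}_t$, $\mathcal{R}_t$, $\mathcal{D}_t(j - 1)$, $\mathcal{I}_t(j)$ are mutually disjoint, $\mathcal{R}_t \subseteq
\mathcal{S}_{t-1}(j)$ and $\mathcal{I}_t(j) \subseteq \mathcal{S}_{t-1}(j)$, (5) implies that

$$\mathcal{S}_{t-1}(j) \cup \mathcal{A}_t \setminus \mathcal{R}_t = \mathcal{S}_t(j) \cup \mathcal{I}_t(j) \setminus \mathcal{D}_t(j - 1) \tag{6}$$

Also, clearly, from Signal Model 1,

$$N_t = N_{t-1} \cup \mathcal{A}_t \setminus \mathcal{R}_t \tag{7}$$

We will use these in the proof of the results of Sec. V.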

IV. Stability of modified-CS 

Modified-CS was first proposed in [15], [16] as a solution to
the problem of sparse reconstruction with partial, and possibly
erroneous, knowledge of the support. Denote this "known"
support by $T$. Modified-CS tries to find a signal that is sparsest
outside of the set $T$ among all signals satisfying the data
constraint. In the noisy case, it solves $\min_\beta \|(\beta)_{T^c}\|_1$ s.t.
$\|y_t - A\beta\| \le \epsilon$. For recursively reconstructing a time sequence
of sparse signals, we use the support estimate from the
previous time, $\hat{N}_{t-1}$, as the set $T$. The support is estimated
by thresholding the output of modified-CS. At the initial time,
$t = 0$, we let $T$ be the empty set, $\emptyset$, i.e. we do simple CS.
Alternatively, as explained in [16], we can use prior knowledge
of the initial signal's support as the set $T$ at $t = 0$, e.g. for
wavelet sparse images with no (or a small) black background,
the set of indices of the approximation coefficients can form
the set $T$. This prior knowledge is usually not as accurate.
Thus, in either case, at $t = 0$ we need more measurements,
i.e. $n_0 > n$.

In this work, for simplicity, we assume that simple CS is
done at $t = 0$. We summarize the algorithm in Algorithm 1.

Algorithm 1 Modified-CS

For $t \ge 0$, do

1) Simple CS. If $t = 0$, set $T = \emptyset$ and compute $\hat{x}_{t,modcs}$
as the solution of

$$\min_\beta \|\beta\|_1 \ \ \text{s.t.} \ \ \|y_0 - A_0\beta\| \le \epsilon \tag{8}$$

2) Modified-CS. If $t > 0$, set $T = \hat{N}_{t-1}$ and compute
$\hat{x}_{t,modcs}$ as the solution of

$$\min_\beta \|(\beta)_{T^c}\|_1 \ \ \text{s.t.} \ \ \|y_t - A\beta\| \le \epsilon \tag{9}$$

3) Estimate the Support. Compute $\tilde{T}$ as

$$\tilde{T} = \{i \in [1,m] : |(\hat{x}_{t,modcs})_i| > \alpha\} \tag{10}$$

4) Set $\hat{N}_t = \tilde{T}$. Output $\hat{x}_{t,modcs}$. Feedback $\hat{N}_t$.
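The control flow of Algorithm 1 can be sketched as follows. This is only a skeleton under stated assumptions: the convex programs (8) and (9) are delegated to a caller-supplied function `solve`, a hypothetical interface that is not part of the paper, and indices are 0-based here rather than in $[1,m]$.

```python
from typing import Callable, List, Sequence, Set, Tuple

# Hypothetical solver interface: given the time index t, the measurements y_t
# and the known support T, return the minimizer of (8) when t = 0 or of (9)
# when t > 0. Any constrained l1 solver could be wrapped to fit this signature.
Solver = Callable[[int, Sequence[float], Set[int]], List[float]]

def estimate_support(x_hat: Sequence[float], alpha: float) -> Set[int]:
    """Step 3: keep the indices whose estimated magnitude exceeds alpha."""
    return {i for i, v in enumerate(x_hat) if abs(v) > alpha}

def modified_cs_recursion(
    ys: Sequence[Sequence[float]], solve: Solver, alpha: float
) -> List[Tuple[List[float], Set[int]]]:
    """Run steps 1 to 4 for t = 0, 1, ..., feeding the support estimate back."""
    T: Set[int] = set()                      # empty T at t = 0, i.e. simple CS
    outputs = []
    for t, y in enumerate(ys):
        x_hat = solve(t, y, T)               # step 1 (t = 0) or step 2 (t > 0)
        T = estimate_support(x_hat, alpha)   # steps 3 and 4: new support estimate
        outputs.append((x_hat, set(T)))
    return outputs
```

Keeping the solver behind a callable makes the recursion itself easy to read: only the current measurements and the previous support estimate are carried from one time instant to the next.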

By adapting the approach of [10], the error of modified-CS
can be bounded as a function of $|T| = |N| + |\Delta_e| - |\Delta|$ and
$|\Delta|$. This was done in [29]. We state a modified version here.

Lemma 1 (modified-CS error bound): Let $x$ be a sparse
vector with support $N$ and let $y := Ax + w$ with $\|w\| \le \epsilon$.
Also, let $\Delta := N \setminus T$ and $\Delta_e := T \setminus N$. Let $\hat{x}_{modcs}$ denote
the solution of (9). If

• $\delta_{|N|+|\Delta|+|\Delta_e|} < \sqrt{2} - 1$ and $|\Delta| \le |N|/3$,

then

$$\|x - \hat{x}_{modcs}\| \le C_1(|N| + |\Delta| + |\Delta_e|)\,\epsilon, \ \ \text{where} \ \ C_1(S) := \frac{4\sqrt{1 + \delta_S}}{1 - (\sqrt{2} + 1)\delta_S} \tag{11}$$

For the sake of completeness, and for ease of review, we
provide a proof in the last appendix, Appendix F. This can
later be removed.

If $\delta_{|N|+|\Delta|+|\Delta_e|}$ is just smaller than $\sqrt{2} - 1$, the error bound
will be very large because the denominator of $C_1(S)$ will be
very small. To keep the bound small, we need to assume that
$\delta_{|N|+|\Delta|+|\Delta_e|} < b(\sqrt{2} - 1)$ with a $b < 1$. For simplicity, let
$b = 1/2$. Then we get the following corollary, which we will
use in our stability results.

Corollary 1 (modified-CS error bound): Let $x$ be a sparse
vector with support $N$ and let $y := Ax + w$ with $\|w\| \le \epsilon$.
Also, let $\Delta := N \setminus T$ and $\Delta_e := T \setminus N$. Let $\hat{x}_{modcs}$ denote
the solution of (9). If

• $\delta_{|N|+|\Delta|+|\Delta_e|} < (\sqrt{2} - 1)/2$ and $|\Delta| \le |N|/3$,

then

$$\|x - \hat{x}_{modcs}\| \le C_1(|N| + |\Delta| + |\Delta_e|)\,\epsilon \le 8.79\epsilon \tag{12}$$

Proof: Notice that $C_1(S)$ is an increasing function of $\delta_S$.
The above corollary follows by using $\delta_{|N|+|\Delta|+|\Delta_e|} < (\sqrt{2} - 1)/2$
to bound $C_1(S)$ by its value at $\delta_S = (\sqrt{2} - 1)/2$, which is 8.79.
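The constant 8.79 can be reproduced numerically. The sketch below assumes that $C_1$ has the standard form from the RIC-based analysis that the lemma adapts, namely $4\sqrt{1+\delta}/(1-(\sqrt{2}+1)\delta)$; the function name `c1` is ours.

```python
import math

def c1(delta_s: float) -> float:
    """Assumed form of C_1, written as a function of the RIC value delta_S."""
    assert 0 <= delta_s < math.sqrt(2) - 1, "bound is only valid below sqrt(2) - 1"
    return 4 * math.sqrt(1 + delta_s) / (1 - (math.sqrt(2) + 1) * delta_s)

delta_half = (math.sqrt(2) - 1) / 2   # the choice b = 1/2
bound = c1(delta_half)                # approximately 8.79

# C_1 grows with delta_S and blows up as delta_S approaches sqrt(2) - 1.
assert c1(0.1) < c1(0.2) < bound
```

At $\delta_S = (\sqrt{2} - 1)/2$ the denominator equals $1/2$, which is why halving the admissible RIC keeps the constant moderate.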

We can state a similar version of the result for CS [10].

Corollary 2 (CS error bound [10]): Let $x$ be a sparse vector
with support $N$ and let $y := Ax + w$ with $\|w\| \le \epsilon$. Let
$\hat{x}_{cs}$ denote the solution of (9) with $T = \emptyset$. If

• $\delta_{2|N|} < (\sqrt{2} - 1)/2$,

then

$$\|x - \hat{x}_{cs}\| \le C_1(2|N|)\,\epsilon \le 8.79\epsilon \tag{13}$$



A. Stability result for modified-CS 

The first step to show stability is to find sufficient conditions
for a certain set of large coefficients to definitely get detected,
and for the elements of $\Delta_e$ to definitely get deleted. These can
be obtained using Corollary 1 and the following simple facts
which we state as a proposition.

Proposition 1 (simple facts): Consider Algorithm 1.

1) An $i \in N$ will definitely get detected in step 3 if
$|x_i| > \alpha + \|x - \hat{x}_{modcs}\|$. This follows since
$\|x - \hat{x}_{modcs}\| \ge \|x - \hat{x}_{modcs}\|_\infty \ge |(x - \hat{x}_{modcs})_i|$.

2) Similarly, all $i \in \Delta_e$ (the zero elements of $T$) will
definitely get deleted in step 3 if $\alpha \ge \|x - \hat{x}_{modcs}\|$.
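Both facts reduce to a one-line triangle inequality argument. The display below spells it out, writing $e := x - \hat{x}_{modcs}$ as a shorthand (introduced here for brevity; it is not notation from the paper).

```latex
% Shorthand: e := x - \hat{x}_{modcs}; recall |e_i| \le \|e\|_\infty \le \|e\|.
\begin{align*}
i \in N,\ |x_i| > \alpha + \|e\|
  &\;\Rightarrow\; |(\hat{x}_{modcs})_i| \ge |x_i| - |e_i| \ge |x_i| - \|e\| > \alpha
  && \text{($i$ is detected)} \\
i \in \Delta_e,\ x_i = 0,\ \alpha \ge \|e\|
  &\;\Rightarrow\; |(\hat{x}_{modcs})_i| = |e_i| \le \|e\| \le \alpha
  && \text{($i$ is deleted)}
\end{align*}
```

The second line uses only $x_i = 0$, so the same argument shows that no index outside $N$ passes the threshold, i.e. there are no false additions.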

Combining the above facts with Corollary 1 we get the
following lemma.

Lemma 2: Let $x$ be a sparse vector with support $N$ and
let $y := Ax + w$ with $\|w\| \le \epsilon$. Also, let $\Delta := N \setminus T$ and
$\Delta_e := T \setminus N$.

Assume that $|N| = S_N$, $|\Delta_e| \le S_{\Delta_e}$ and $|\Delta| \le S_\Delta$.
Consider Algorithm 1.

1) Let $L := \{i \in N : |x_i| \ge b_1\}$. All elements of $L$ will
get detected in step 3 if

a) $\delta_{S_N + S_{\Delta_e} + S_\Delta} < (\sqrt{2} - 1)/2$ and $S_\Delta \le S_N/3$, and

b) $b_1 > \alpha + 8.79\epsilon$.

2) In step 3, there will be no false additions, and all the true
removals from the support (the set $\Delta_e$) will get deleted
at the current time, if

a) $\delta_{S_N + S_{\Delta_e} + S_\Delta} < (\sqrt{2} - 1)/2$ and $S_\Delta \le S_N/3$, and

b) $\alpha \ge 8.79\epsilon$.

In the above lemma and proposition, for ease of notation,
we have removed the subscript $t$ from $x_t$, $N_t$, $T_t$ and $\Delta_t$.

We use the above lemma to obtain the stability result
as follows. Let us fix a bound on the maximum allowed
magnitude of a missed coefficient. Suppose we want to ensure
that only coefficients with magnitude less than $2r$ are part of
the final set of misses, $\tilde{\Delta}_t$, at any time, $t$, and that the final
set of extras, $\tilde{\Delta}_{e,t}$, is an empty set. In other words, we want
to find conditions to ensure that $\tilde{\Delta}_t \subseteq \mathcal{S}_t(2)$ and $|\tilde{\Delta}_{e,t}| = 0$.
Using Signal Model 1, $|\mathcal{S}_t(2)| = 2S_a$ and thus $\tilde{\Delta}_t \subseteq \mathcal{S}_t(2)$
will imply that $|\tilde{\Delta}_t| \le 2S_a$. This leads to the following result.
The result can be easily generalized to ensure that, for some
$d_0 \le d$, $\tilde{\Delta}_t \subseteq \mathcal{S}_t(d_0)$, and thus $|\tilde{\Delta}_t| \le (2d_0 - 2)S_a$, holds at
all times $t$. We show how to do this for the result of the next
section in Appendix D; an analogous thing can be done for
Theorem 1 as well.

Theorem 1 (Stability of modified-CS): Assume Signal
Model 1 on $x_t$. Also assume that $y_t$ satisfies (1) with
$\|w_t\| \le \epsilon$. Consider Algorithm 1. If the following hold

1) (support estimation threshold) set $\alpha = 8.79\epsilon$,

2) (support size, support change rate) $S_0$, $S_a$ satisfy
$\delta_{S_0 + 3S_a} < (\sqrt{2} - 1)/2$ and $S_a \le S_0/6$,

3) (new element increase rate) $r > G$, where

$$G := \frac{\alpha + 8.79\epsilon}{2} = 8.79\epsilon \tag{14}$$

4) (initial time) at $t = 0$, $n_0$ is large enough to ensure that
$\tilde{\Delta}_0 \subseteq \mathcal{S}_0(2)$, $|\tilde{\Delta}_0| \le 2S_a$, $|\tilde{\Delta}_{e,0}| = 0$ and $|\tilde{T}_0| \le S_0$,

then,

1) at all $t \ge 0$, $|\tilde{T}_t| \le S_0$, $|\tilde{\Delta}_{e,t}| = 0$, $\tilde{\Delta}_t \subseteq \mathcal{S}_t(2)$ and so
$|\tilde{\Delta}_t| \le 2S_a$,

2) at all $t > 0$, $|T_t| \le S_0$, $|\Delta_{e,t}| \le S_a$, and $|\Delta_t| \le 2S_a$,

3) at all $t > 0$, $\|x_t - \hat{x}_{t,modcs}\| \le 8.79\epsilon$.

Proof: The complete proof is given in Appendix B. It
follows using induction. We use the induction assumption;
the fact that $T_t = \tilde{T}_{t-1} = \hat{N}_{t-1}$; and the fact that $N_t =
N_{t-1} \cup \mathcal{A}_t \setminus \mathcal{R}_t$ to bound $|T_t|$, $|\Delta_t|$ and $|\Delta_{e,t}|$. Next, we use
these bounds and Lemma 2 to bound $|\tilde{\Delta}_t|$ and $|\tilde{\Delta}_{e,t}|$. Finally
$|\tilde{T}_t| \le |N_t| + |\tilde{\Delta}_{e,t}|$ helps to bound $|\tilde{T}_t|$.
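As a reading aid, the display below sketches how the theorem's conditions line up with Lemma 2 once the induction hypothesis supplies bounds on the predicted support errors. This is our summary of the argument under the stated substitutions, not a replacement for the proof in Appendix B.

```latex
% Induction hypothesis at time t:
%   |T_t| \le S_0, \quad |\Delta_t| \le 2S_a, \quad |\Delta_{e,t}| \le S_a.
% Apply Lemma 2 with S_N = S_0, S_\Delta = 2S_a, S_{\Delta_e} = S_a and b_1 = 2r:
\begin{align*}
\delta_{S_N + S_{\Delta_e} + S_\Delta} = \delta_{S_0 + 3S_a} &< (\sqrt{2} - 1)/2, \\
S_\Delta \le S_N/3 \;&\Longleftrightarrow\; 2S_a \le S_0/3 \;\Longleftrightarrow\; S_a \le S_0/6, \\
b_1 > \alpha + 8.79\epsilon \ \text{with} \ \alpha = 8.79\epsilon
  \;&\Longleftrightarrow\; 2r > 17.58\epsilon \;\Longleftrightarrow\; r > 8.79\epsilon .
\end{align*}
```

With these substitutions, part 1 of Lemma 2 says every coefficient of magnitude at least $2r$ is detected, so the misses lie in the small elements' set with threshold $2r$, and part 2 says the extras are all deleted.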

B. Discussion 

Remark 3: We note that condition 4 is not restrictive. It is
easy to see that this will hold if the number of measurements
at $t = 0$, $n_0$, is large enough to ensure that the measurement
matrix at $t = 0$, $A_0$, satisfies $\delta_{2S_0}(A_0) < (\sqrt{2} - 1)/2$ and
conditions 1 and 3 hold.

Notice that all the support errors are bounded by $2S_a$ or
less. Under slow support change, $S_a \ll S_0$ and so $2S_a$ is also
small compared to the support size, $S_0$, making the above
result a meaningful stability result.

Let us compare the results for modified-CS and simple CS.
Since simple CS is not a recursive approach (each time instant
is handled separately), Corollary 2 is also a stability result
for it. From Corollary 2, simple CS needs $\delta_{2S_0} < (\sqrt{2} - 1)/2$
to ensure that its error is bounded by $8.79\epsilon$ for all $t$.
On the other hand, for $t > 0$, our result from Theorem 1
only needs $S_a \le S_0/6$ and $\delta_{S_0 + 3S_a} < (\sqrt{2} - 1)/2$ to get the
same error bound. Under $S_a \ll S_0$ (slow support change),
$S_a \le S_0/6$ easily holds and $\delta_{S_0 + 3S_a} < (\sqrt{2} - 1)/2$ is clearly
weaker than the simple CS condition. Thus, at $t > 0$, for a
given measurement matrix $A$, modified-CS error is guaranteed
to remain below $8.79\epsilon$ for larger support sizes, $S_0$, than for
simple CS. Said another way, for a given $S_0$, modified-CS
needs fewer measurements (only enough to satisfy $\delta_{S_0 + 3S_a} <
(\sqrt{2} - 1)/2$), than simple CS (which needs enough to satisfy
$\delta_{2S_0} < (\sqrt{2} - 1)/2$).

At t = 0, the modified-CS algorithm of Algorithm 1 needs
the same number of measurements as simple CS. If reliable 
prior support knowledge were available at t = 0, one would 
need fewer measurements even at t = 0. 

The above discussion only compares sufficient conditions.
We back it up with actual simulation comparisons in Figs. 3(a)
and 3(b), where we compare the average reconstruction error
when $n$ is just large enough to ensure small (less than 0.5%)
error for modified-CS. With this $n$, the CS error is between
20% and 30%. Here "error" refers to normalized mean squared error
(NMSE). The simulation details are given in Sec. VII.

C. Limitations 

Before going further, let us discuss the limitations of the
above result and of modified-CS itself. First, in Proposition
1 and hence everywhere after that, we bound the $\ell_\infty$ norm
of the error by the $\ell_2$ norm. This is often a loose bound and
results in a loose lower bound on the required threshold $\alpha$
and consequently a larger than required lower bound on the
minimum required rate of coefficient increase/decrease, $r$.

Second, modified-CS uses single step thresholding for
estimating the support $\hat{N}_t$. The threshold, $\alpha$, needs to be large
enough to ensure correct deletion of all the removed elements
and no false detection of zero elements (condition 1). But this
means that the magnitude increase rate, $r$, needs to be even
larger to ensure correct detection, and no false deletion, of all
but the smallest $2S_a$ nonzero elements (condition 3).

There is another related issue which is not seen in the
theoretical analysis because we only bound the $\ell_2$ norm of
the error, but is actually more important since it affects the
reconstruction itself, not just the sufficient conditions for its
stability. This has to do with the fact that $\hat{x}_{t,modcs}$ is a biased
estimate of $x_t$. A similar issue for noisy CS, and a possible
solution (Gauss-Dantzig selector), was first discussed in [32].
In our context, along $T^c$, the values of $\hat{x}_{t,modcs}$ will be biased
towards zero (because we minimize $\|(\beta)_{T^c}\|_1$), while, along
$T$, they may be biased away from zero (since there is no
constraint on $(\beta)_T$). The bias will be larger when the noise
is larger. This will create the following problem. The set $T$
contains the set $\Delta_e$ which needs to be deleted. Since the
estimates along $\Delta_e$ may be biased away from zero, one will
need a higher threshold to delete them. But that would make
detection more difficult, especially since the estimates along
$\Delta \subseteq T^c$ will be biased towards zero. In the next section, we
discuss a partial solution to this and the previous issue.

V. MODIFIED-CS WITH ADD-LS-DEL AND ITS STABILITY 

The last two issues mentioned above in Sec. IV-C can be
partly addressed by replacing the single support estimation
step by a three step Add-LS-Del procedure summarized in
Algorithm 2. This idea was first introduced in our older work
03], lfT3l for recursive sparse reconstruction and simultane- 
ously also in 11301 . BTI for greedy algorithms for static sparse 
reconstruction. It involves a support addition step (that uses a
smaller threshold), as in (15), followed by LS estimation on
the new support estimate, $T_{add}$, as in (16), and then a deletion
step that thresholds the LS estimate, as in (17). This can be
followed by a second LS estimation using the final support
estimate, as in (18), although this last step is not critical. The
addition step threshold, $\alpha_{add}$, needs to be just large enough
to ensure that the matrix used for LS estimation, $A_{T_{add}}$, is
well-conditioned. If $\alpha_{add}$ is chosen properly and if $n$ is large
enough, the LS estimate on $T_{add}$ will have smaller error than
the modified-CS output. As a result, deletion will be more
accurate when done using this estimate. This also means that
one can use a larger deletion threshold, $\alpha_{del}$, which will
ensure quicker deletion of extras. We summarize the algorithm
in Algorithm 2. Notice the reduction in error of modified-CS
with add-LS-del as compared to modified-CS in Fig. 3.



Algorithm 2 Modified-CS with Add-LS-Del 

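For $t \ge 0$, do

1) Simple CS. If $t = 0$, set $T = \emptyset$ and compute $\hat{x}_{t,modcs}$
as the solution of ©.

2) Modified-CS. If $t > 0$, set $T = \hat{N}_{t-1}$ and compute
$\hat{x}_{t,modcs}$ as the solution of

3) Additions / LS. Compute $T_{add}$ and the LS estimate using
it:

$T_{add} = T \cup \{i \in T^c : |(\hat{x}_{t,modcs})_i| > \alpha_{add}\}$ (15)

$(\hat{x}_{t,add})_{T_{add}} = A_{T_{add}}^{\dagger} y_t, \quad (\hat{x}_{t,add})_{T_{add}^c} = 0$ (16)

4) Deletions / LS. Compute $\tilde{T}$ and the LS estimate using it:

$\tilde{T} = T_{add} \setminus \{i \in T_{add} : |(\hat{x}_{t,add})_i| \le \alpha_{del}\}$ (17)

$(\hat{x}_t)_{\tilde{T}} = A_{\tilde{T}}^{\dagger} y_t, \quad (\hat{x}_t)_{\tilde{T}^c} = 0$ (18)

5) Set $\hat{N}_t = \tilde{T}$. Feedback $\hat{N}_t$. Output either $\hat{x}_t$ or $\hat{x}_{t,modcs}$.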
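
The two thresholding steps (15) and (17) of Algorithm 2 can be sketched as plain set operations. This is only an illustrative sketch: the function names `add_step` and `delete_step` are ours, the modified-CS output and the LS estimate on $T_{add}$ are taken as given inputs (the $\ell_1$ and LS solvers are not shown), and the toy numbers are hypothetical.

```python
def add_step(T, x_modcs, alpha_add):
    """Addition step (15): T_add = T ∪ {i in T^c : |x_modcs[i]| > alpha_add}."""
    T = set(T)
    detected = {i for i, v in enumerate(x_modcs)
                if i not in T and abs(v) > alpha_add}
    return T | detected


def delete_step(T_add, x_add, alpha_del):
    """Deletion step (17): remove indices of T_add whose LS estimate
    has magnitude at most alpha_del."""
    return {i for i in T_add if abs(x_add[i]) > alpha_del}


# Toy example (hypothetical numbers, not taken from the paper).
T = {0, 1}                               # support estimate fed back from t-1
x_modcs = [1.0, 0.9, 0.3, 0.02, 0.0]     # assumed modified-CS output
T_add = add_step(T, x_modcs, alpha_add=0.1)

x_add = [1.0, 0.05, 0.6, 0.0, 0.0]       # assumed LS estimate on T_add
T_new = delete_step(T_add, x_add, alpha_del=0.5)
```

Here index 2 is added because its estimate exceeds the small addition threshold, and index 1 is deleted because its LS estimate falls below the larger deletion threshold.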

Definition 4 (Define $T_{add,t}$, $\Delta_{add,t}$, $\Delta_{e,add,t}$): The set $T_{add,t}$
is the support estimate obtained after the support addition
step. It is defined in (15) in Algorithm 2. The set $\Delta_{add,t} :=
N_t \setminus T_{add,t}$ denotes the set of missing elements from $T_{add,t}$ and
the set $\Delta_{e,add,t} := T_{add,t} \setminus N_t$ denotes the set of extras in it.
We remove the subscript $t$ where not needed.



A. Stability result for Modified-CS with Add-LS-Del 

The first step to show stability is to find sufficient conditions
for (a) a certain set of large coefficients to definitely get
detected, and (b) to definitely not get falsely deleted, and (c)
for the zero coefficients in $T_{add}$ to definitely get deleted. These
can be obtained using Corollary 1 and the following simple
facts which we state as a proposition, in order to easily refer
to them later. In the proposition and the three lemmas below,
we remove the subscript $t$ for ease of notation.

Proposition 2 (simple facts): Consider Algorithm 2.

1) An $i \in \Delta$ will definitely get detected in step 3 if
$|x_i| > \alpha_{add} + \|x - \hat{x}_{modcs}\|$. This follows since
$\|x - \hat{x}_{modcs}\| \ge \|x - \hat{x}_{modcs}\|_\infty \ge |(x - \hat{x}_{modcs})_i|$.

2) Similarly, an $i \in T_{add}$ will definitely not get falsely
deleted in step 4 if $|x_i| > \alpha_{del} + \|(x - \hat{x}_{add})_{T_{add}}\|$.

3) All $i \in \Delta_{e,add}$ (the zero elements of $T_{add}$) will definitely
get deleted if $\alpha_{del} \ge \|(x - \hat{x}_{add})_{T_{add}}\|$.

4) Consider LS estimation on the known part of support
$T$, i.e. consider the estimate $(\hat{x}_{LS})_T = A_T^{\dagger} y$ and
$(\hat{x}_{LS})_{T^c} = 0$ computed from $y := Ax + w$. Let
$\Delta := N \setminus T$ where $N$ is the support of $x$. If $\|w\| \le \epsilon$ and if
$\delta_{|T|} < 1/2$, then $\|(x - \hat{x}_{LS})_T\| \le \sqrt{2}\epsilon + 2\theta_{|T|,|\Delta|}\|x_\Delta\|$.
This bound is derived in [14, equation (15)] (see footnote 3).

Combining the above facts with Corollary 1, we can easily
get the following three lemmas.

Lemma 3 (Detection condition): Let $x$ be a sparse vector
with support $N$ and let $y := Ax + w$ with $\|w\| \le \epsilon$. Also, let
$\Delta := N \setminus T$ and $\Delta_e := T \setminus N$.
Assume that $|N| = S_N$, $|\Delta_e| \le S_{\Delta_e}$, $|\Delta| \le S_\Delta$.
Consider Algorithm 2. For a given $b_1$, let

$L := \{i \in \Delta : |x_i| \ge b_1\}$.

All elements of $L$ will get detected in step 3 if

1) $\delta_{S_N + S_{\Delta_e} + S_\Delta} < (\sqrt{2} - 1)/2$ and $S_\Delta \le S_N/3$, and

2) $b_1 > \alpha_{add} + 8.79\epsilon$.

Proof: This lemma follows from fact 1 of Proposition 2
and Corollary 1.

Lemma 4 (No false deletion condition): Let $x$ be a sparse
vector with support $N$ and let $y := Ax + w$ with $\|w\| \le \epsilon$.
Also, let $T_{add}$, $\Delta_{add}$, $\Delta_{e,add}$ be as defined in Definition 4.
Assume that $|T_{add}| \le S_T$ and $|\Delta_{add}| \le S_\Delta$.
Consider Algorithm 2. For a given $b_1$, let

$L := \{i \in T_{add} : |x_i| \ge b_1\}$.

No element of $L$ will get (falsely) deleted in step 4 if

1) $\delta_{S_T} < 1/2$ and

2) $b_1 > \alpha_{del} + \sqrt{2}\epsilon + 2\theta_{S_T,S_\Delta}\|x_{\Delta_{add}}\|$.

Proof: This lemma follows directly from fact 2 and fact 4
(applied with $T \equiv T_{add}$ and $\Delta \equiv \Delta_{add}$) of Proposition 2.

Lemma 5 (Deletion condition): Let $x$ be a sparse vector
with support $N$ and let $y := Ax + w$ with $\|w\| \le \epsilon$. Also, let
$T_{add}$, $\Delta_{add}$, $\Delta_{e,add}$ be as defined in Definition 4.
Assume that $|T_{add}| \le S_T$ and $|\Delta_{add}| \le S_\Delta$.

3 Instead of $\delta_{|T|} < 1/2$, one can pick any $b < 1$ and the constants in the
bound will change appropriately.
Consider Algorithm 2. All elements of $\Delta_{e,add}$ will get deleted
in step 4 if

1) $\delta_{S_T} < 1/2$ and

2) $\alpha_{del} \ge \sqrt{2}\epsilon + 2\theta_{S_T,S_\Delta}\|x_{\Delta_{add}}\|$.

Proof: This lemma follows directly from fact 3 and fact 4
(applied with $T \equiv T_{add}$ and $\Delta \equiv \Delta_{add}$) of Proposition 2.

Using the above lemmas and the signal model, we can
obtain sufficient conditions to ensure that, for some $d_0 \le d$,
at each time $t$, $\tilde{\Delta}_t \subseteq \mathcal{S}_t(d_0)$ (so that $|\tilde{\Delta}_t| \le (2d_0 - 2)S_a$)
and $|\tilde{\Delta}_{e,t}| = 0$, i.e. only elements smaller than $d_0 r$ may be
missed and there are no extras. For notational simplicity, we
state the special case below which uses $d_0 = 2$. The general
case is given in Appendix D in Corollary 4. In fact, Corollary 4 is
the generalized version of Corollary 3, which relaxes some
assumptions of the result below.

Theorem 2 (Stability of modified-CS with add-LS-del):
Assume Signal Model 1 on $x_t$. Also assume that $y_t$ satisfies
(1) with $\|w_t\| \le \epsilon$. Consider Algorithm 2. If

1) (addition and deletion thresholds)

a) $\alpha_{add}$ is large enough so that there are at most $S_a$
false additions per unit time,

b) $\alpha_{del} = \sqrt{2}\epsilon + 2\sqrt{S_a}\,\theta_{S_0+2S_a,S_a}\, r$,

2) (support size, support change rate) $S_0$, $S_a$ satisfy

a) $\delta_{S_0+3S_a} < (\sqrt{2} - 1)/2$ and $S_a \le S_0/6$, and

b) $\theta_{S_0+2S_a,S_a} < \frac{1}{2} \cdot \frac{1}{2\sqrt{S_a}}$,

3) (new element increase rate) $r \ge \max(G_1, G_2)$, where

$G_1 \triangleq \frac{\alpha_{add} + 8.79\epsilon}{2}, \quad G_2 \triangleq \frac{\sqrt{2}\epsilon}{1 - 2\sqrt{S_a}\,\theta_{S_0+2S_a,S_a}}$ (19)

4) (initial time) at $t = 0$, $n_0$ is large enough to ensure that
$\tilde{\Delta}_0 \subseteq \mathcal{S}_0(2)$, $|\tilde{\Delta}_0| \le 2S_a$, $|\tilde{\Delta}_{e,0}| = 0$, and $|\tilde{T}_0| \le S_0$,

then,

1) at all $t \ge 0$, $|\tilde{T}_t| \le S_0$, $|\tilde{\Delta}_{e,t}| = 0$, $\tilde{\Delta}_t \subseteq \mathcal{S}_t(2)$ and so
$|\tilde{\Delta}_t| \le 2S_a$,

2) at all $t > 0$, $|T_t| \le S_0$, $|\Delta_{e,t}| \le S_a$, and $|\Delta_t| \le 2S_a$,

3) at all $t > 0$, $|T_{add,t}| \le S_0 + 2S_a$, $|\Delta_{e,add,t}| \le 2S_a$, and
$|\Delta_{add,t}| \le S_a$,

4) at all $t \ge 0$, $\|x_t - \hat{x}_t\| \le \sqrt{2}\epsilon + (2\theta_{S_0,2S_a} + 1)\sqrt{2S_a}\, r$,

5) at all $t > 0$, $\|x_t - \hat{x}_{t,modcs}\| \le 8.79\epsilon$.
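
To make the roles of conditions 1b, 2b and 3 concrete, the following sketch evaluates the deletion threshold and the two lower bounds on $r$ for given parameter values. It assumes the readings $\alpha_{del} = \sqrt{2}\epsilon + 2\sqrt{S_a}\theta r$, $\theta < 1/(4\sqrt{S_a})$, $G_1 = (\alpha_{add} + 8.79\epsilon)/2$ and $G_2 = \sqrt{2}\epsilon/(1 - 2\sqrt{S_a}\theta)$, where $\theta$ stands for $\theta_{S_0+2S_a,S_a}$. The function name and the numerical inputs are hypothetical; in practice $\theta$ is a property of the matrix $A$ and is not easy to compute.

```python
import math


def theorem2_quantities(eps, alpha_add, S_a, theta, r):
    """Evaluate the threshold and rate conditions of the stability result.

    theta is an assumed value of theta_{S0+2Sa,Sa}; it is an input here,
    not something this sketch computes from a measurement matrix.
    """
    root = math.sqrt(S_a)
    condition_2b = theta < 1.0 / (4.0 * root)
    alpha_del = math.sqrt(2) * eps + 2.0 * root * theta * r
    G1 = (alpha_add + 8.79 * eps) / 2.0
    # G2 is only meaningful when 1 - 2*sqrt(S_a)*theta > 0.
    G2 = math.sqrt(2) * eps / (1.0 - 2.0 * root * theta) if condition_2b else math.inf
    return {
        "condition_2b": condition_2b,
        "alpha_del": alpha_del,
        "G1": G1,
        "G2": G2,
        "r_ok": r >= max(G1, G2),
    }


# Hypothetical parameter values, chosen only to exercise the formulas.
q = theorem2_quantities(eps=0.1, alpha_add=0.05, S_a=4, theta=0.05, r=1.0)
```

For these inputs the larger of the two bounds is $G_1$, so the addition threshold and the modified-CS error bound, rather than the deletion step, determine the smallest admissible $r$.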

Proof: The complete proof is given in Appendix C. This
proof also follows by induction, but is more complicated than
that of Theorem 1. The induction step consists of three parts.

• First, we use the induction assumption; the fact that $T_t =
\tilde{T}_{t-1} = \hat{N}_{t-1}$; and the fact that $N_t = N_{t-1} \cup \mathcal{A}_t \setminus \mathcal{R}_t$
to bound $|T_t|$, $|\Delta_{e,t}|$, $|\Delta_t|$. This part of the proof is the
same as that of Theorem 1. The next two parts are quite
different.

• We use the bounds from the first part; equation (|6);
Lemma 3; the limit on the number of false detections
from condition 1a; and $|T_{add}| \le |N| + |\Delta_{e,add}|$ to bound
$|\Delta_{add,t}|$, $|\Delta_{e,add,t}|$, $|T_{add,t}|$.

• Finally, we use the bounds from the second part; Lemmas
4 and 5; and $|\tilde{T}| \le |N| + |\tilde{\Delta}_e|$ to bound $|\tilde{\Delta}_t|$, $|\tilde{\Delta}_{e,t}|$, $|\tilde{T}_t|$.



B. Discussion 

Notice that condition 2b may become difficult to satisfy as
soon as $S_a$ increases, which will happen when the problem
dimension, $m$, increases, and consequently $S_0$ increases, even
though $S_a$ and $S_0$ remain small fractions of $m$, e.g. typically
$S_0 \approx 10\%$ of $m$ and $S_a \approx 2\%$ to $10\%$ of $S_0$, i.e. $0.2\%$ to $1\%$ of $m$. The
reason we get this condition is because in facts 2 and 3 of
Proposition 2 and hence also in Lemmas 4 and 5 and in
the final result, we bound the $\ell_\infty$ norm of the LS step error,
$(x - \hat{x}_{add})_{T_{add}}$, by its $\ell_2$ norm. This is clearly a loose bound.
It holds with equality only when the entire LS step error is
concentrated in one dimension.

In practice, as observed in our simulations, the LS step error
is actually quite spread out, since the LS step tends to reduce
the bias in the estimate, at least as long as the number of
misses in $T_{add}$ is small and $A_{T_{add}}$ is well conditioned (which are
required conditions for stability anyway and are enforced by
conditions 3 and 2a of Theorem 2). Thus, it is not unreasonable
to assume that $\|(x - \hat{x}_{add})_{T_{add}}\|_\infty \le C\|(x - \hat{x}_{add})_{T_{add}}\|$ for
some $C < 1$. From simulations, it is observed that $C = \zeta_m/\sqrt{S_a}$
works (see footnote 4). Here $\zeta_m$ is slightly more than one and increases very
slowly with $m$, e.g. for $m = 200$, $\zeta_m = 1.11$, for $m = 1000$,
$\zeta_m = 1.23$ and for $m = 2000$, $\zeta_m = 1.38$. The above numbers
were obtained when we simulated according to the generative
model for Signal Model 1 given in Appendix A; we used
$S_0 = 0.1m$, $S_a = 0.01m$, and $r = 1$; the matrix $A$ was random
Gaussian, with $n = 0.3861\, S_0 \log_2 m$; the noise, $w_t$, was
independent identically distributed (i.i.d.) uniform$(-c, c)$ in
various dimensions and over time and we used $c = 0.1266$;
and we set $\alpha_{add} = c/2$ and $\alpha_{del} = r/2$. Similar conclusions
were obtained for $r = 3/4$ and $2/3$.
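
As a rough illustration of how such a spread constant can be estimated, the sketch below computes, for a collection of LS step error vectors, the largest value of $\sqrt{S_a}\,\|e\|_\infty / \|e\|_2$. This assumes the spread assumption has the form $\|e\|_\infty \le (\zeta_m/\sqrt{S_a})\|e\|_2$; the function names and the toy error vectors are ours and are not taken from the paper's simulations.

```python
import math


def spread_ratio(e, S_a):
    """Return sqrt(S_a) * ||e||_inf / ||e||_2 for one error vector e."""
    l2 = math.sqrt(sum(v * v for v in e))
    if l2 == 0.0:
        raise ValueError("error vector is identically zero")
    return math.sqrt(S_a) * max(abs(v) for v in e) / l2


def estimate_zeta(errors, S_a):
    """Empirical zeta: the maximum spread ratio over all error vectors."""
    return max(spread_ratio(e, S_a) for e in errors)


# Two toy error vectors (hypothetical): one perfectly spread out,
# one concentrated in a single coordinate.
spread_out = [1.0, 1.0, 1.0, 1.0]
concentrated = [3.0, 0.0, 0.0, 0.0]
zeta = estimate_zeta([spread_out, concentrated], S_a=4)
```

The concentrated vector attains the worst case $\|e\|_\infty = \|e\|_2$, which is exactly the situation in which bounding the $\ell_\infty$ norm by the $\ell_2$ norm is tight.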

Using the extra assumption $\|(x - \hat{x}_{add})_{T_{add}}\|_\infty \le
\frac{\zeta_m}{\sqrt{S_a}}\|(x - \hat{x}_{add})_{T_{add}}\|$ in facts 2 and 3 of Proposition 2, Lemmas
4 and 5 get replaced by the following two lemmas. Using
these new lemmas, condition 2b of Theorem 2 will get
replaced by $\theta_{S_0+2S_a,S_a} < \frac{1}{4\zeta_m}$, which is an easily satisfiable
condition. Moreover, this also makes the lower bound on
the required value of $r$ (rate of coefficient increase/decrease)
smaller.

Lemma 6 (No false deletion condition - weaker): Let $x$ be
a sparse vector with support $N$ and let $y := Ax + w$,
with $\|w\| \le \epsilon$. Also, let $T_{add}$, $\Delta_{add}$, $\Delta_{e,add}$ be as defined in
Definition 4.

Assume that $|T_{add}| \le S_T$ and $|\Delta_{add}| \le S_\Delta$.

Consider Algorithm 2. Assume that the LS step error is spread
out enough to ensure that

$\|(x - \hat{x}_{add})_{T_{add}}\|_\infty \le \frac{\zeta_m}{\sqrt{S_a}}\|(x - \hat{x}_{add})_{T_{add}}\|$.

For a given $b_1$, let

$L := \{i \in T_{add} : |x_i| \ge b_1\}$.

No elements of $L$ will get (falsely) deleted in step 4 if

4 We computed $\zeta_m$ by computing the maximum of
$\frac{\sqrt{S_a}\,\|(x_t - \hat{x}_{add,t})_{T_{add,t}}\|_\infty}{\|(x_t - \hat{x}_{add,t})_{T_{add,t}}\|}$
over time and over 500 independent simulations
for $m = 200$ (and over 50 for $m = 1000, 2000$). The matrix $A$ was chosen
once and fixed. We sampled over the distributions of $w_t$ and $x_t$.
1) $\delta_{S_T} < 1/2$, and

2) $b_1 > \alpha_{del} + \frac{\zeta_m}{\sqrt{S_a}}(\sqrt{2}\epsilon + 2\theta_{S_T,S_\Delta}\|x_{\Delta_{add}}\|)$.

Lemma 7 (Deletion condition - weaker): Let $x$ be a sparse
vector with support $N$ and let $y := Ax + w$, with $\|w\| \le \epsilon$.
Also, let $T_{add}$, $\Delta_{add}$, $\Delta_{e,add}$ be as defined in Definition 4.
Assume that $|T_{add}| \le S_T$ and $|\Delta_{add}| \le S_\Delta$.
Consider Algorithm 2. Assume that the LS step error is spread
out enough to ensure that

$\|(x - \hat{x}_{add})_{T_{add}}\|_\infty \le \frac{\zeta_m}{\sqrt{S_a}}\|(x - \hat{x}_{add})_{T_{add}}\|$.

All elements of $\Delta_{e,add}$ will get deleted in step 4 if

1) $\delta_{S_T} < 1/2$ and

2) $\alpha_{del} \ge \frac{\zeta_m}{\sqrt{S_a}}(\sqrt{2}\epsilon + 2\theta_{S_T,S_\Delta}\|x_{\Delta_{add}}\|)$.

By using Lemmas 6 and 7 instead of Lemmas 4 and 5
respectively, and doing everything else exactly as in the proof
of Theorem 2, we get the following corollary.

Corollary 3 (Stability of modified-CS with add-LS-del - 2):
Assume Signal Model 1 on $x_t$. Also assume that $y_t$ satisfies
(1) with $\|w_t\| \le \epsilon$. Let

$e_t := (x_t - \hat{x}_{add,t})_{T_{add,t}}$

denote the LS step error. Assume that the LS step error is
spread out enough so that

$\|e_t\|_\infty \le \frac{\zeta_m}{\sqrt{S_a}}\|e_t\|$ (20)

at all times, $t$. Consider Algorithm 2. If

1) (addition and deletion thresholds)

a) $\alpha_{add}$ is large enough so that there are at most $S_a$
false additions per unit time,

b) $\alpha_{del} = \sqrt{\frac{2}{S_a}}\,\zeta_m \epsilon + 2\theta_{S_0+2S_a,S_a}\,\zeta_m r$,

2) (support size, support change rate) $S_0$, $S_a$ satisfy

a) $\delta_{S_0+3S_a} < (\sqrt{2} - 1)/2$ and $S_a \le S_0/6$, and

b) $\theta_{S_0+2S_a,S_a} < \frac{1}{4\zeta_m}$,

3) (new element increase rate) $r \ge \max(G_1, G_2)$, where

$G_1 \triangleq \frac{\alpha_{add} + 8.79\epsilon}{2}, \quad G_2 \triangleq \frac{\sqrt{2}\,\zeta_m \epsilon}{\sqrt{S_a}\,(1 - 2\theta_{S_0+2S_a,S_a}\,\zeta_m)}$ (21)

4) (initial time) at $t = 0$, $n_0$ is large enough to ensure that
$\tilde{\Delta}_0 \subseteq \mathcal{S}_0(2)$, $|\tilde{\Delta}_0| \le 2S_a$, $|\tilde{\Delta}_{e,0}| = 0$, $|\tilde{T}_0| \le S_0$,
then all conclusions of Theorem 2 hold.

A generalization of the above corollary, that allows the
support error to stabilize at $(2d_0 - 2)S_a$, for some $d_0 \le d$, is
given in Appendix D in Corollary 4.

Recall that $\zeta_m/\sqrt{S_a}$ is smaller than one. For example,
in our simulations, when $m = 2000$, $\zeta_m = 1.38$, while
$\sqrt{S_a} = \sqrt{20} = 4.47$. Also, $\zeta_m$ increases very slowly with
$m$ (slower than $O(\log m)$) whereas $\sqrt{S_a}$ typically increases
as $\sqrt{m}$. Thus, conditions 1b, 2b and 3 are significantly weaker
compared to those in Theorem 2 while others are the same.
In particular, now condition 2b is easy to satisfy.

Let us compare this result with that for modified-CS given
in Theorem 1. Consider the lower bound on $r$ required by
both results. In the above result, since $\theta_{S_0+2S_a,S_a} < 1/(4\zeta_m)$,
we have $G_2 < \frac{2\sqrt{2}\,\zeta_m \epsilon}{\sqrt{S_a}} < 2.9\epsilon < \frac{8.79\epsilon}{2} \le G_1$ and thus $G_1$ is what
decides the minimum allowed value of $r$. Because of add-LS-del,
the addition threshold, $\alpha_{add}$, can now be much smaller, as
long as the number of false adds is small (see footnote 5). If $\alpha_{add}$ is close to
zero, the value of $G_1$ is almost half that of $G$ in Theorem 1.
Thus the minimum coefficient increase rate, $r$, required by the
above result is almost half of that required by Theorem 1. On
the other hand, the above result also requires condition 2b on
$\theta$ which Theorem 1 does not, but this condition is typically
weaker than condition 2a since $\theta_{S_0+2S_a,S_a}$ is smaller than
$\delta_{S_0+3S_a}$ whereas the right hand sides do not differ by much.

The above is also demonstrated in Figs. 3(b) and 3(c). For
r = 1, both are stable, but for r = 2/3, modified-CS is
unstable while modified-CS with add-LS-del is still stable.

Finally, let us compare our result with the simple CS result
given in Corollary 2. Corollary 2 needs $\delta_{2S_0} < (\sqrt{2} - 1)/2 =
0.207$ to achieve the same error bound as our result. On the
other hand, if the LS step error is spread out enough, we
only need $\delta_{S_0+3S_a} < (\sqrt{2} - 1)/2 = 0.207$ and $\theta_{S_0+2S_a,S_a} <
1/(4\zeta_m)$. When $S_a \ll S_0$ (slow support change), the first
condition is clearly weaker than what CS needs. The second
condition is also weaker since $\theta_{S_0+2S_a,S_a}$ is significantly
smaller than $\delta_{2S_0}$ whereas the right hand sides 0.207 and
$0.25/\zeta_m$ are roughly equal. A quantitative comparison can be
done by using the upper bounds $\theta_{k_1,k_2} \le \delta_{k_1+k_2}$ E and
$\delta_{ck} \le c\,\delta_{2k}$ EQ. If $S_a = 0.02 S_0$, then $\delta_{2S_0} = \delta_{100 S_a} \le 100\,\delta_{2S_a}$
and $\theta_{S_0+2S_a,S_a} \le \delta_{S_0+3S_a} \le 53\,\delta_{2S_a}$. Thus, the CS condition
is stronger as long as $\zeta_m < (100/53)(0.25/0.207) = 2.28$. If
$S_a = 0.1 S_0$, then the CS condition is stronger if $\zeta_m < 1.9$.
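
The arithmetic behind the two thresholds on $\zeta_m$ can be reproduced with a few lines. The sketch below assumes the argument is as read above: with $\delta_{cS} \le c\,\delta_{2S}$, the CS condition involves the multiplier $2S_0/S_a$, the modified-CS condition involves $(S_0 + 3S_a)/S_a$, and the right hand sides are $(\sqrt{2}-1)/2$ and $0.25/\zeta_m$. The function name is ours.

```python
import math


def zeta_threshold(S0, Sa):
    """Largest zeta_m for which the simple CS condition is the stronger one,
    under the upper bounds delta_{cS} <= c * delta_{2S}."""
    c_cs = 2 * S0 / Sa            # delta_{2 S0} = delta_{c_cs * Sa}
    c_mod = (S0 + 3 * Sa) / Sa    # theta_{S0+2Sa,Sa} <= delta_{c_mod * Sa}
    rhs_cs = (math.sqrt(2) - 1) / 2   # about 0.207
    return (c_cs / c_mod) * (0.25 / rhs_cs)


slow_change = zeta_threshold(S0=100, Sa=2)    # S_a = 0.02 S_0
faster_change = zeta_threshold(S0=10, Sa=1)   # S_a = 0.1 S_0
```

Rounded to the precision used in the text, the two values are 2.28 and 1.9, both comfortably above the values of $\zeta_m$ reported from the simulations.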

Remark 4: In the discussion so far we have used the special
case stability results where we find conditions to ensure that
the misses remain below $2S_a$. Let us look at the general form
of the result, Corollary 4 in Appendix D, where we provide
conditions to ensure that, for some $d_0 \le d$, the misses are
below $(2d_0 - 2)S_a$. In Corollary 4, using an argument similar
to the one above, $G_2 < G_1$ holds for any $d_0$. Also, notice that,
if the rate of coefficient increase, $r$, is smaller, $r \ge G_1$ will
hold for a larger value of $d_0$. This means that the support error
bound, $(2d_0 - 2)S_a$, will be larger. This, in turn, decides what
conditions on $\delta$ and $\theta$ are needed (in other words, how many
measurements, $n$, are needed). Smaller $r$ means a larger $d_0$ is
needed which, in turn, means that stronger conditions on $\delta$, $\theta$
(larger $n$) are needed. Thus, for a given $n$, as $r$ is reduced, the
algorithm will stabilize to larger and larger support error levels
(larger $d_0$) and finally become unstable (because the given $n$
does not satisfy the conditions on $\delta$, $\theta$ for the larger $d_0$).

The above is demonstrated empirically in Fig. 3. The last
three rows of this figure used n = 59. When r = 1, modified-CS
with add-LS-del is stable at zero support errors. When r
is reduced to 2/3, it is stable at mean support errors less than
0.3%. When r is reduced to 2/5 it becomes unstable.



5 e.g. in simulations with $m = 200$, $S_0 = 20$, $S_a = 2$, $r = 1$ (or even for
$r = 2/3$), $n = 59$, $(w_t)_j$ i.i.d. uniform$(-c, c)$ with $c = 0.1266$, and
$\alpha_{del} = r/2$, we were able to use $\alpha_{add} = c/2 = 0.06$ and still ensure that
the number of false adds is less than or equal to $S_a$ (details in Sec. VII).
VI. Stability of LS-CS

In O, |28l , we introduced Least Squares CS-residual (LS-CS)
as one of the first solutions to the problem of recursively
reconstructing sparse signal sequences with slow time-varying
sparsity patterns. We summarize the complete LS-CS
algorithm in Algorithm 3. LS-CS uses partial knowledge
of support, $T$, in a different way than modified-CS. It first
computes an initial LS estimate on the set $T$, as in (22), and
then computes the observation residual, as in (23). Noisy CS is
done on this observation residual, as in (24), and the solution is
added back to the initial LS estimate, as in (25). The add-LS-del
approach described earlier is used for support estimation.



Algorithm 3 Least Squares CS-residual (LS-CS) 

For $t \ge 0$, do

1) Simple CS. Do as in Algorithm 2.

2) CS-residual.

a) Use $T := \hat{N}_{t-1}$ to compute the initial LS estimate,
$\hat{x}_{t,init}$, and the LS residual, $\tilde{y}_{t,res}$, as follows.

$(\hat{x}_{t,init})_T = A_T^{\dagger} y_t, \quad (\hat{x}_{t,init})_{T^c} = 0$ (22)

$\tilde{y}_{t,res} = y_t - A\hat{x}_{t,init}$ (23)

b) Do noisy CS on the LS residual, i.e. solve

$\min_\beta \|\beta\|_1$ s.t. $\|\tilde{y}_{t,res} - A\beta\| \le \epsilon$ (24)

and denote its output by $\hat{\beta}_t$. Compute

$\hat{x}_{t,CSres} := \hat{\beta}_t + \hat{x}_{t,init}$. (25)

3) Additions / LS. Compute $T_{add}$ and the LS estimate on it
as in Algorithm 2. Use $\hat{x}_{t,CSres}$ instead of $\hat{x}_{t,modcs}$ for
estimating $T_{add}$.

4) Deletions / LS. Compute $\tilde{T}$ and the LS estimate on it as
in Algorithm 2.

5) Set $\hat{N}_t = \tilde{T}$. Output $\hat{x}_t$. Feedback $\hat{N}_t$.



The CS-residual step error, $x_t - \hat{x}_{t,CSres}$, where $\hat{x}_{t,CSres}$ is
defined in (25), can be bounded as follows. The proof is easy
and follows in the same way as that for [14, Corollary 1] where
noisy CS is done using Dantzig selector instead of (24). We
use (24) here to keep the comparison with modified-CS easier.

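Lemma 8 (CS-residual error bound [14]): Let $x$ be a
sparse vector with support $N$ and let $y := Ax + w$ with
$\|w\| \le \epsilon$. Also, let $\Delta := N \setminus T$ and $\Delta_e := T \setminus N$. Consider
step 2 of Algorithm 3. If

$\delta_{2|\Delta|} < (\sqrt{2} - 1)/2$ and

$\delta_{|T|} < 1/2$,

then

$\|x - \hat{x}_{CSres}\| \le C'\epsilon + \theta_{|T|,|\Delta|}\,C''\,\|x_\Delta\|$, where

$C' \equiv C'(|T|, |\Delta|) \triangleq C_1(2|\Delta|) + \sqrt{2}\,C_2(2|\Delta|)\sqrt{\frac{|T|}{|\Delta|}}$,

$C'' \equiv C''(|T|, |\Delta|) \triangleq 2\,C_2(2|\Delta|)\sqrt{|T|}$,

$C_1(S)$ is defined in O, and $C_2(S) \triangleq 2\,\frac{1 + (\sqrt{2} - 1)\delta_S}{1 - (\sqrt{2} + 1)\delta_S}$.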



A. Stability result for LS-CS 

Our overall approach is similar to the one discussed in
the previous section for modified-CS with add-LS-del. The
key difference is in the detection condition lemma, which we
give below. Its proof is given in Appendix E. This lemma is
different from Lemma 2 because, unlike modified-CS, the CS-residual
error bound at time $t$ also depends on the magnitudes
of the elements in the initial missed set $\Delta_t$.

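Lemma 9 (Detection condition for LS-CS): Let $x$ be a
sparse vector with support $N$ and let $y := Ax + w$ with
$\|w\| \le \epsilon$. Also, let $\Delta := N \setminus T$ and $\Delta_e := T \setminus N$. Assume
that $|T| \le S_T$ and $|\Delta| \le S_\Delta$. Assume that $\|x_\Delta\|_\infty \le b$.
Consider step 3 of Algorithm 3. For a $\gamma \le 1$, let

$L_1 := \{i \in \Delta : \gamma b \le |x_i| \le b\}$

and let

$L_2 := \Delta \setminus L_1 = \{i \in \Delta : |x_i| < \gamma b\}$.

Assume that $|L_1| \le S_{L_1}$ and $\|x_{L_2}\| \le \kappa b$. All $i \in L_1$ will
definitely get detected at the current time if

1) $\delta_{2S_\Delta} < (\sqrt{2} - 1)/2$,

2) $\delta_{S_T} < 1/2$,

3) $\max_{|\Delta| \le S_\Delta} \theta_{S_T,|\Delta|}\,C''(S_T, |\Delta|) < \frac{\gamma}{2(\sqrt{S_{L_1}} + \kappa)}$, and

4) $\max_{|\Delta| \le S_\Delta} \frac{\alpha_{add} + C'(S_T, |\Delta|)\,\epsilon}{\gamma - \theta_{S_T,|\Delta|}\,C''(S_T, |\Delta|)(\sqrt{S_{L_1}} + \kappa)} < b$,

where $C'(\cdot,\cdot)$, $C''(\cdot,\cdot)$ are defined in Lemma 8.

Proof: The proof is given in Appendix E.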

The stability result then follows in the same fashion as
Theorem 2. The only difference is that instead of Lemma 3,
we apply Lemma 9 with $S_T = S_0$, $S_\Delta = 2S_a$, $b = 2r$, $\gamma = 1$,
$S_{L_1} = S_a$ and $\kappa = \sqrt{S_a}/2$.

Theorem 3 (Stability of LS-CS): Assume Signal Model 1
on $x_t$. Also assume that $y_t$ satisfies (1) with $\|w_t\| \le \epsilon$.
Consider Algorithm 3. If

1) (addition and deletion thresholds)

a) $\alpha_{add}$ is large enough so that there are at most $S_a$
false additions per unit time,

b) $\alpha_{del} = \sqrt{2}\epsilon + 2\sqrt{S_a}\,\theta_{S_0+2S_a,S_a}\, r$,

2) (support size, support change rate) $S_0$, $S_a$ satisfy

a) $\delta_{4S_a} < (\sqrt{2} - 1)/2$,

b) $\delta_{S_0+2S_a} < 1/2$,

c) $\max_{|\Delta| \le 2S_a} \theta_{S_0,|\Delta|}\,C''(S_0, |\Delta|) < \frac{1}{3\sqrt{S_a}}$,

d) $\theta_{S_0+2S_a,S_a} < \frac{1}{2} \cdot \frac{1}{2\sqrt{S_a}}$,

3) (new element increase rate) $r \ge \max(G_1, G_2)$, where

$G_1 \triangleq \max_{|\Delta| \le 2S_a} \left[ \frac{\alpha_{add} + C'(S_0, |\Delta|)\,\epsilon}{2 - 3\,\theta_{S_0,|\Delta|}\sqrt{S_a}\,C''(S_0, |\Delta|)} \right], \quad G_2 \triangleq \frac{\sqrt{2}\epsilon}{1 - 2\sqrt{S_a}\,\theta_{S_0+2S_a,S_a}}$ (27)

4) (initialization) (same condition as in Theorem 2),

then, all conclusions of Theorem 2 hold for LS-CS, except
the last one. This is replaced by $\|x_t - \hat{x}_{t,CSres}\| \le
\max_{|\Delta| \le 2S_a} [C'(S_0, |\Delta|)\,\epsilon + (\theta_{S_0,|\Delta|}$

B. Discussion 

Notice that conditions 2c and 2d are the difficult conditions
to satisfy as the problem size, $m$, increases and consequently
$S_0$ and $S_a$ increase. We get condition 2d because we bound
the $\ell_\infty$ norm of the addition LS step error by its $\ell_2$ norm. This
can be relaxed to $\theta_{S_0+2S_a,S_a} < 1/(4\zeta_m)$ in the same fashion
as in the previous section.

Consider condition 2c. We get this condition because (i)
we upper bound the $\ell_\infty$ norm of the CS-residual step error,
$x_t - \hat{x}_{t,CSres}$, by its $\ell_2$ norm in Lemma 9 and (ii) in the proof
of Lemma 8, we upper bound the $\ell_1$ norm of the initial LS
step error, $(x_t - \hat{x}_{t,init})_T$, by $\sqrt{|T|}$ times its $\ell_2$ norm (this
results in the expression for $C''$ given in Lemma 8). If we
can relax (i), we can try to weaken the required condition,
but it will still be stronger than what modified-CS with
add-LS-del or modified-CS need. For example, if we can assume
a bound similar to (20) for the CS-residual step error, and
if additionally, we assume that, in the range $|\Delta| \le 2S_a$,
$\theta_{S_0,|\Delta|}\,C''(S_0, |\Delta|)$ is largest for $|\Delta| = 2S_a$, condition 2c will

get relaxed to something like 0s o ,2S a C2{2S a ) < 3^— \J^§§^- 
This is still stronger than condition 2b of Corollary 3, primarily
because of $\sqrt{S_a/S_0}$.

The above is also observed in our simulations. In Fig. 3,
LS-CS needs a larger n (n = 65) for stability whereas for
modified-CS with add-LS-del or modified-CS, n = 59 suffices.
We show the results for r = 1 or lower, but even when we
increased r to r = 2 or r = 3, LS-CS was still unstable with
n = 59. The simulation details are given in Sec. VII.

VII. Simulation Results 

We compared modified-CS (mod-CS), as given in Algorithm
1, modified-CS with Add-LS-Del (mod-CS-add-LS-del), as
given in Algorithm 2 (with final output $\hat{x}_t$), LS-CS, as given
in Algorithm 3, and simple CS for a few different choices
of $n$ and $r$. The results are shown in Fig. 3, where we show
four rows of plots. In each row, we plot the normalized mean
squared error (NMSE), $\frac{E[\|x_t - \hat{x}_t\|^2]}{E[\|x_t\|^2]}$, the normalized mean
extras, $\frac{E[|\hat{N}_t \setminus N_t|]}{E[|N_t|]}$, and the normalized mean misses, $\frac{E[|N_t \setminus \hat{N}_t|]}{E[|N_t|]}$,
in the left, middle and right columns respectively. Here $E[\cdot]$
denotes the empirical mean over the 500 realizations.
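
The three plotted quantities can be computed from a set of realizations as sketched below. This assumes the metrics are ratios of empirical means, namely $E[\|x_t - \hat{x}_t\|^2]/E[\|x_t\|^2]$, $E[|\hat{N}_t \setminus N_t|]/E[|N_t|]$ and $E[|N_t \setminus \hat{N}_t|]/E[|N_t|]$, all evaluated at one fixed time $t$; the function name and the toy data are hypothetical.

```python
def normalized_metrics(realizations):
    """Compute (NMSE, normalized mean extras, normalized mean misses).

    Each realization is a tuple (x, x_hat, N, N_hat): the true signal,
    its estimate, the true support and the estimated support (as sets).
    A ratio of empirical means equals the ratio of the corresponding sums.
    """
    sq_err = sq_norm = 0.0
    extras = misses = support = 0
    for x, x_hat, N, N_hat in realizations:
        sq_err += sum((a - b) ** 2 for a, b in zip(x, x_hat))
        sq_norm += sum(a * a for a in x)
        extras += len(N_hat - N)   # estimated but not in the true support
        misses += len(N - N_hat)   # in the true support but not estimated
        support += len(N)
    return sq_err / sq_norm, extras / support, misses / support


# Two toy realizations (hypothetical values).
data = [
    ([1.0, 0.0, 2.0], [1.0, 0.0, 1.0], {0, 2}, {0, 1}),
    ([0.0, 3.0, 0.0], [0.0, 3.0, 0.0], {1}, {1}),
]
nmse, mean_extras, mean_misses = normalized_metrics(data)
```

Normalizing by the mean support size, rather than by $m$, is what makes the extras and misses directly comparable to the support error bounds in the theorems.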

In all rows, we used the generative model for Signal Model
1 from Appendix A with $m = 200$, $S_0 = 20$, $S_a = 2$.
The measurement noise was $(w_t)_j \sim$ i.i.d. uniform$(-c, c)$ with
$c = 0.1266$, i.e. it was i.i.d. uniform in all dimensions and
over time. Each element of the measurement matrix, $A$, was
i.i.d. zero mean random Gaussian. Fig. 3(a) used $n = 65$,
$r = 1$ and $d = 3$, while the other three rows used $n = 59$.
Fig. 3(b) used $n = 59$, $r = 1$, $d = 3$; Fig. 3(c) used $n = 59$,
$r = 2/3$, $d = 3$; and Fig. 3(d) used $n = 59$, $r = 2/5$, $d = 5$.
Our simulations selected $A$ once and kept it fixed, but Monte
Carlo averaged over $w_t$ and $x_t$.

We set the addition threshold, $\alpha_{add}$, to be at the noise level:
we set it to $c/2$. Assuming that the LS step after support
addition gives a fairly accurate estimate of the nonzero values,
one can set the deletion threshold, $\alpha_{del}$, to a larger value of
$\alpha_{del} = r/2$ and still ensure that there are no (or very few) false
deletions. Larger deletion threshold ensures that all (or most)
of the false additions and removals get deleted. Modified-CS
used a single threshold, $\alpha$, somewhere in between $\alpha_{add}$ and
$\alpha_{del}$. We set $\alpha = ((c/2) + (r/2))/2$ (we picked this after trying
a few different options for $\alpha$). Also, we did not do anything
at $t = 0$. We just started our simulation at $t = 1$ with the
assumption that $|\Delta_0| = 2$, $|\Delta_{e,0}| = 0$ and hence $|T_0| = S_0 - 2$
(i.e. the initial time condition of all our theorems holds).
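
For concreteness, the three thresholds described above can be evaluated for the reported noise level. The helper name below is ours; the formulas $\alpha_{add} = c/2$, $\alpha_{del} = r/2$ and $\alpha = ((c/2) + (r/2))/2$ are the ones stated in the text.

```python
def simulation_thresholds(c, r):
    """Return (alpha_add, alpha_del, alpha) for noise bound c and rate r."""
    alpha_add = c / 2.0                     # addition threshold at the noise level
    alpha_del = r / 2.0                     # larger deletion threshold
    alpha = (alpha_add + alpha_del) / 2.0   # single threshold used by mod-CS
    return alpha_add, alpha_del, alpha


alpha_add, alpha_del, alpha = simulation_thresholds(c=0.1266, r=1.0)
```

With $c = 0.1266$ and $r = 1$ this gives an addition threshold of about 0.063, a deletion threshold of 0.5, and a single modified-CS threshold of about 0.28.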

Notice, from the plots, that LS-CS needs at least n = 65
for stability (compare Fig. 3(a) with Fig. 3(b)) whereas
mod-CS-add-LS-del and mod-CS are stable even with n = 59. We
also tried using n = 59 and larger values of r, but even with
r = 3 LS-CS was still unstable in a few cases.

Secondly, even with n = 65, simple CS NMSE is about
20% whereas mod-CS-add-LS-del and LS-CS are stable at
0.1% and mod-CS is stable at 0.3%. We do not show support
recovery errors for simple CS since they were very large. With
n = 59, simple CS NMSE goes up to 30%. We also show the
NMSE plot for simple Gauss-CS (CS followed by a final LS
step on the estimated support, done in a fashion similar to
Gauss-Dantzig selector [32]). Since the CS error itself is so
large, this debiasing step does not help.

When n = 65 and r = 1, mod-CS is stable, but has larger 
error than both LS-CS and mod-CS-add-LS-del. When n = 59 
and r = 1, LS-CS becomes unstable. But, mod-CS and mod- 
CS-add-LS-del are still stable, with mod-CS being stable at a 
larger error (both larger support error and MSE) than
mod-CS-add-LS-del. When r is reduced to 2/3, mod-CS also becomes
unstable. But mod-CS-add-LS-del is still stable, though at 
higher error values than when r = 1. When r is further reduced 
to 2/5, even mod-CS-add-LS-del becomes unstable. 

Mod-CS-add-LS-del uses a better support estimation method
and thus its extras and misses are both much smaller than those 
of mod-CS. As a result, (a) it can remain stable for smaller 
values of r than mod-CS; and (b) when both are stable, its 
reconstruction error is smaller than that of mod-CS. 

VIII. Conclusions 

Under mild assumptions, we showed the "stability" of 
modified-CS and its improved version, modified-CS with add- 
LS-del, and of LS-CS for recursive sparse signal sequence 
reconstruction. By "stability" we mean that the number of 
misses from the current support estimate and the number of 
extras in it remain bounded by a time-invariant value at all 
times. Under slow support change, the results are meaningful,
i.e. the bound is small compared to the support size. A direct 
corollary is that the reconstruction errors are also bounded by 
time-invariant and small values. 

We can argue that our results ensure stability under weaker
assumptions than those required by simple CS. We are also
able to compare the implications of the results for the three 
recursive algorithms and argue that modified-CS with add- 
LS-del needs the weakest conditions on both the number of 
measurements, n, and on the rate of coefficient magnitude 
increase/decrease, r. Modified-CS needs similar conditions 
on n, but needs r to be larger. LS-CS needs the strongest 
conditions on both n and r. All of our conclusions are 
supported by empirical performance evaluations that compare 
the reconstruction error as well as the support recovery errors 
using Monte Carlo simulations. 

Two open questions that remain are as follows. The first 
is how to show stability for a stochastic model of signal 
change that models small random variations around the mean 
number of support additions/removals and around the mean 
magnitude increase/decrease rate. A second open question is 
to show stability under reasonable assumptions for approaches 
that also use slow signal value change, e.g. KF-CS ITS! , |[T4ll 
or regularized modified-CS |[T6l or of ifPTl . 

Appendix 
A. Generative Models for Signal Model 1

To help understand Signal Model 1 better, we provide here
two possible generative models that satisfy its assumptions. In
both cases, at $t = 0$, the support size is $S_0$ and it contains $2S_a$
elements each with magnitude $r, 2r, \ldots, (d - 1)r$, and
$(S_0 - (2d - 2)S_a)$ elements with magnitude $M$.

1) Generative Model 1: This assumes that when a new element 
gets added to the support, its magnitude keeps increasing 
at rate $r$ until it reaches $M := dr$. An analogous model is 
assumed for decrease until removal from support. The sign is 
selected as $+1$ or $-1$ with equal probability when the element 
gets added to the support, but remains the same after that. 

Mathematically this can be described as follows. Let 
$(x_t)_i = (m_t)_i (s_t)_i$ where $(m_t)_i$ denotes the magnitude and 
$(s_t)_i$ denotes the sign of $(x_t)_i$ at time $t$. 
At any t > 0, do the following. 

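1) Update 

$$\mathcal{I}_t(j) = \mathcal{I}_{t-1}(j-1) \ \text{for all } 2 \le j \le d, \ \text{and} \quad \mathcal{D}_t(j) = \mathcal{D}_{t-1}(j+1) \ \text{for all } 0 \le j \le d-2 \tag{28}$$

where $\mathcal{I}_t(j)$ and $\mathcal{D}_t(j)$ are defined in Definition 3. 
Recall that the removed set, $\mathcal{R}_t = \mathcal{D}_t(0)$. 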

2) Generate 

a) the new addition set, $\mathcal{A}_t = \mathcal{I}_t(1)$, of size $S_a$ 
uniformly at random from $N_{t-1}^c$, and 

b) the new decreasing set, $\mathcal{D}_t(d-1)$, of size $S_a$ 
uniformly at random from $\{i \in N_{t-1} : (x_{t-1})_i = M\}$. 



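3) Update the coefficients' magnitudes as follows. 

$$(m_t)_i = \begin{cases} (m_{t-1})_i + r, & i \in \cup_{j=1}^{d} \mathcal{I}_t(j) \\ (m_{t-1})_i - r, & i \in \cup_{j=0}^{d-1} \mathcal{D}_t(j) \\ (m_{t-1})_i, & i \in C_t \end{cases} \tag{29}$$

where $C_t := N_t \setminus \{\cup_{j=1}^{d} \mathcal{I}_t(j) \cup \cup_{j=0}^{d-1} \mathcal{D}_t(j)\}$. 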

4) Update the signs as follows. 

$$(s_t)_i = \begin{cases} (s_{t-1})_i, & i \in N_t \setminus \mathcal{A}_t \\ \mathrm{iid}(\pm 1), & i \in \mathcal{A}_t \\ 0, & i \in N_t^c \end{cases} \tag{30}$$

where $\mathrm{iid}(\pm 1)$ refers to generating the sign as $+1$ or $-1$ 
with equal probability and doing this independently for 
each element $i$. 

5) Set $(x_t)_i = (m_t)_i (s_t)_i$ for all $i$. 
Our simulations used the above model. 
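
The five steps of Generative Model 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the simulation code used for the paper: the function names (`init_state`, `step`, `simulate`), the default parameter values, and the choice to store magnitudes as integer levels (multiples of $r$) are all assumptions made here. The sketch also assumes $m - S_0 \ge S_a$ and $S_0 \ge (2d-1)S_a$, so that the sets sampled in step 2 are large enough.

```python
import random


def init_state(m, S0, Sa, d, rng):
    """t = 0: 2*Sa elements at each level 1..d-1 (Sa rising, Sa falling), the rest at level d."""
    idx = rng.sample(range(m), S0)
    level, sign = [0] * m, [0] * m
    inc = {j: set() for j in range(1, d + 1)}   # inc[j]: elements that just rose to level j
    dec = {j: set() for j in range(0, d)}       # dec[j]: elements that just fell to level j
    pos = 0
    for j in range(1, d):
        inc[j] = set(idx[pos:pos + Sa]); pos += Sa
        dec[j] = set(idx[pos:pos + Sa]); pos += Sa
        for i in inc[j] | dec[j]:
            level[i] = j
    for i in idx[pos:]:
        level[i] = d                            # constant-magnitude elements, M = d*r
    for i in idx:
        sign[i] = rng.choice((-1, 1))
    return level, sign, inc, dec


def step(level, sign, inc, dec, Sa, d, rng):
    """One time step; mutates level and sign, returns the new increasing/decreasing sets."""
    m = len(level)
    # Step 1: shift the sets, as in (28).
    new_inc = {j: inc[j - 1] for j in range(2, d + 1)}
    new_dec = {j: dec[j + 1] for j in range(0, d - 1)}
    # Step 2: draw the new additions and the new decreasing set.
    outside = [i for i in range(m) if level[i] == 0]
    at_max = [i for i in range(m) if level[i] == d]
    new_inc[1] = set(rng.sample(outside, Sa))
    new_dec[d - 1] = set(rng.sample(at_max, Sa))
    # Step 3: update magnitudes by one level (i.e. by r), as in (29).
    for j in range(1, d + 1):
        for i in new_inc[j]:
            level[i] += 1
    for j in range(0, d):
        for i in new_dec[j]:
            level[i] -= 1
    # Step 4: new elements get a random sign, removed elements get sign 0, as in (30).
    for i in new_inc[1]:
        sign[i] = rng.choice((-1, 1))
    for i in new_dec[0]:
        sign[i] = 0
    return new_inc, new_dec


def simulate(m=200, S0=20, Sa=2, d=3, r=1.0, T=50, seed=0):
    """Return the list [x_0, ..., x_T]; step 5 is x = magnitude * sign."""
    rng = random.Random(seed)
    level, sign, inc, dec = init_state(m, S0, Sa, d, rng)
    xs = [[l * r * s for l, s in zip(level, sign)]]
    for _ in range(T):
        inc, dec = step(level, sign, inc, dec, Sa, d, rng)
        xs.append([l * r * s for l, s in zip(level, sign)])
    return xs
```

In this sketch the support size stays at $S_0$ and each intermediate magnitude level keeps exactly $2S_a$ elements at every time, which is the structure the signal model assumes.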

2) Generative Model 2: A second reasonable generative 
model selects any $S_a$ out of the $2S_a$ elements with current 
magnitude $jr$ and increases them, and decreases the other $S_a$ 
elements. In other words, it replaces the first step above by 
the following, while keeping the rest of the steps the same. 

1) Generate 

a) $\mathcal{I}_t(j)$ of size $S_a$ uniformly at random from 
$\{i \in N_{t-1} : (x_{t-1})_i = (j-1)r\}$ for all $2 \le j \le d$. 

b) $\mathcal{D}_t(j)$ of size $S_a$ uniformly at random from 
$\{i \in N_{t-1} : (x_{t-1})_i = (j+1)r\}$ for all $0 \le j \le d-2$. 
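
The modified first step of Generative Model 2 can be sketched as follows. This is a hedged illustration only: the function name `split_levels` and the integer-level representation of magnitudes are assumptions, and the sketch follows the prose description (the decreasing set at each level is taken to be the $S_a$ elements that were not selected to increase), which presumes exactly $2S_a$ elements at each intermediate level.

```python
import random


def split_levels(level, Sa, d, rng):
    """At each magnitude level 1..d-1, pick Sa elements to rise; the other Sa fall.

    level[i] is the magnitude of element i at time t-1, in units of r.
    Returns (inc, dec) with inc[j] = I_t(j) for 2 <= j <= d
    and dec[j] = D_t(j) for 0 <= j <= d-2.
    """
    inc, dec = {}, {}
    for lev in range(1, d):
        members = [i for i, l in enumerate(level) if l == lev]
        rising = set(rng.sample(members, Sa))
        inc[lev + 1] = rising                  # these move from lev*r up to (lev+1)*r
        dec[lev - 1] = set(members) - rising   # these move from lev*r down to (lev-1)*r
    return inc, dec


# Hypothetical example with d = 3, Sa = 2.
levels = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 0, 0]
inc, dec = split_levels(levels, Sa=2, d=3, rng=random.Random(0))
print(sorted(inc), sorted(dec))  # [2, 3] [0, 1]
```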

B. Appendix: Proof of Theorem 1

We prove the first claim by induction. Using condition 4 of 
the theorem, the claim holds for $t = 0$. This proves the base 
case. For the induction step, assume that the claim holds at 
$t-1$, i.e. $|\tilde\Delta_{e,t-1}| = 0$, $|\tilde T_{t-1}| \le S_0$, and $\tilde\Delta_{t-1} \subseteq \mathcal{S}_{t-1}(2)$ 
so that $|\tilde\Delta_{t-1}| \le 2S_a$. Using this we prove that the claim 
holds at $t$. In the proof, we use the following facts often: (a) 
$\mathcal{R}_t \subseteq N_{t-1}$ and $\mathcal{A}_t \subseteq N_{t-1}^c$, (b) $N_t = N_{t-1} \cup \mathcal{A}_t \setminus \mathcal{R}_t$, and (c) 
if two sets $B, C$ are disjoint, then $D \cup C \setminus B := (D \cup C) \setminus B = 
(D \cap B^c) \cup C$ for any set $D$. 
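
The set identity in the last fact, and the role of the disjointness assumption, can be checked on a small example. The sets below are arbitrary illustrative choices.

```python
def lhs(D, C, B):
    """(D ∪ C) \\ B"""
    return (D | C) - B


def rhs(D, C, B):
    """(D ∩ B^c) ∪ C, where D ∩ B^c is the same as D \\ B"""
    return (D - B) | C


D = {1, 2, 3, 4}
B = {2, 3, 9}

C = {5, 6}                               # disjoint from B
print(lhs(D, C, B) == rhs(D, C, B))      # True: both are {1, 4, 5, 6}

C_bad = {3, 5}                           # overlaps B, so the identity can fail
print(lhs(D, C_bad, B) == rhs(D, C_bad, B))  # False: {1, 4, 5} vs {1, 3, 4, 5}
```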

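We first bound $|T_t|$, $|\Delta_{e,t}|$, $|\Delta_t|$. Since $T_t = \tilde T_{t-1} = \hat N_{t-1}$, 
we have $|T_t| \le S_0$. Also, $\Delta_{e,t} = \hat N_{t-1} \setminus N_t = \hat N_{t-1} \cap [(N_{t-1}^c \cap 
\mathcal{A}_t^c) \cup \mathcal{R}_t] \subseteq \tilde\Delta_{e,t-1} \cup \mathcal{R}_t = \mathcal{R}_t$. The last equality follows 
since $|\tilde\Delta_{e,t-1}| = 0$. Thus $|\Delta_{e,t}| \le |\mathcal{R}_t| = S_a$. 

Consider $|\Delta_t|$. Notice that $\Delta_t = N_t \setminus \hat N_{t-1} = (N_{t-1} \cap 
\hat N_{t-1}^c \cap \mathcal{R}_t^c) \cup (\mathcal{A}_t \cap \hat N_{t-1}^c) = (\tilde\Delta_{t-1} \cap \mathcal{R}_t^c) \cup (\mathcal{A}_t \cap \hat N_{t-1}^c) \subseteq 
(\mathcal{S}_{t-1}(2) \cap \mathcal{R}_t^c) \cup \mathcal{A}_t = \mathcal{S}_{t-1}(2) \cup \mathcal{A}_t \setminus \mathcal{R}_t$. Here we used 
$\tilde\Delta_{t-1} \subseteq \mathcal{S}_{t-1}(2)$. Since $\mathcal{R}_t \subseteq \mathcal{S}_{t-1}(2)$ and $\mathcal{A}_t$ is disjoint with 
$\mathcal{S}_{t-1}(2)$, $|\Delta_t| \le |\mathcal{S}_{t-1}(2)| + |\mathcal{A}_t| - |\mathcal{R}_t| = 2S_a + S_a - S_a$. 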

Next we bound $|\tilde\Delta_t|$, $|\tilde\Delta_{e,t}|$, $|\tilde T_t|$. Consider the support 
estimation step. Apply the first claim of Lemma 2 with 
$S_N = S_0$, $S_{\Delta e} = S_a$, $S_\Delta = 2S_a$, and $b_1 = 2r$. Since 
conditions 2 and 3 of the theorem hold, all elements of $N_t$ with 
magnitude equal to or greater than $2r$ will get detected. Thus, 
$\tilde\Delta_t \subseteq \mathcal{S}_t(2)$. Apply the second claim of the lemma. Since 
conditions 2 and 1 hold, all zero elements will get deleted 
and there will be no false detections, i.e. $|\tilde\Delta_{e,t}| = 0$. Finally, 
$|\tilde T_t| \le |N_t| + |\tilde\Delta_{e,t}| \le S_0 + 0$. 

The second claim for time $t$ follows using the first claim 
for time $t-1$ and the arguments from the paragraphs above. The 
third claim follows using the second claim and Corollary 1. 

C. Appendix: Proof of Theorem 2

We prove the first claim of the theorem by induction. Using 
condition 4 of the theorem, the claim holds for $t = 0$. This 
proves the base case. For the induction step, assume that the 
claim holds at $t-1$, i.e. $|\tilde\Delta_{e,t-1}| = 0$, $|\tilde T_{t-1}| \le S_0$, and 
$\tilde\Delta_{t-1} \subseteq \mathcal{S}_{t-1}(2)$ so that $|\tilde\Delta_{t-1}| \le 2S_a$. Using this, we prove 
that the claim holds at $t$. We will use the following facts often: 
(a) $\mathcal{R}_t \subseteq N_{t-1}$, (b) $\mathcal{A}_t \subseteq N_{t-1}^c$, (c) $N_t = N_{t-1} \cup \mathcal{A}_t \setminus \mathcal{R}_t$, 
and (d) if two sets $B, C$ are disjoint, then $D \cup C \setminus B := 
(D \cup C) \setminus B = (D \cap B^c) \cup C$ for any set $D$. 

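The bounding of $|T_t|$, $|\Delta_t|$, $|\Delta_{e,t}|$ is exactly as in the proof 
of Theorem 1. Since $T_t = \tilde T_{t-1}$, we have $|T_t| \le S_0$. Also, $\Delta_{e,t} = 
\hat N_{t-1} \setminus N_t = \hat N_{t-1} \cap [(N_{t-1}^c \cap \mathcal{A}_t^c) \cup \mathcal{R}_t] \subseteq \tilde\Delta_{e,t-1} \cup \mathcal{R}_t = \mathcal{R}_t$. 
Thus $|\Delta_{e,t}| \le |\mathcal{R}_t| = S_a$. Finally, $\Delta_t = N_t \setminus \hat N_{t-1} = (\tilde\Delta_{t-1} \cap 
\mathcal{R}_t^c) \cup (\mathcal{A}_t \cap \hat N_{t-1}^c) \subseteq (\mathcal{S}_{t-1}(2) \cap \mathcal{R}_t^c) \cup \mathcal{A}_t$. Thus, 

$$\Delta_t \subseteq \mathcal{S}_{t-1}(2) \cup \mathcal{A}_t \setminus \mathcal{R}_t \tag{31}$$

Since $\mathcal{R}_t \subseteq \mathcal{S}_{t-1}(2)$ and $\mathcal{A}_t$ is disjoint with $\mathcal{S}_{t-1}(2)$, 
$|\Delta_t| \le |\mathcal{S}_{t-1}(2)| + |\mathcal{A}_t| - |\mathcal{R}_t| = 2S_a + S_a - S_a$. 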

Consider the detection step. There are at most $S_a$ false 
detects (from condition 1a) and thus $|\Delta_{e,add,t}| \le |\Delta_{e,t}| + S_a \le 
2S_a$. Thus $|T_{add,t}| \le |N_t| + |\Delta_{e,add,t}| \le S_0 + 2S_a$. 

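Next, consider $|\Delta_{add,t}|$. Notice that 

$$\Delta_t \subseteq \mathcal{S}_{t-1}(2) \cup \mathcal{A}_t \setminus \mathcal{R}_t \subseteq \mathcal{S}_t(2) \cup \mathcal{I}_t(2) \setminus \mathcal{D}_t(1). \tag{32}$$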

The first C is from OTT i. the second one follows by using (O 
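for $j = 2$. Now, apply Lemma 3 with $S_N = S_0$, $S_{\Delta e} = S_a$, 
$S_\Delta = 2S_a$, and with $b_1 = 2r$. Using (32), $L = \Delta_t \cap \mathcal{I}_t(2)$. 
Since conditions 2 and 3 hold, by Lemma 3 all elements of $L$ 
will definitely get detected at time $t$. Thus $\Delta_{add,t} \subseteq \Delta_t \setminus L = 
\Delta_t \setminus \mathcal{I}_t(2)$. But from (32), $\Delta_t \setminus \mathcal{I}_t(2) \subseteq \mathcal{S}_t(2) \setminus \mathcal{D}_t(1)$. Since 
$\mathcal{D}_t(1) \subseteq \mathcal{S}_t(2)$, $|\Delta_{add,t}| \le |\mathcal{S}_t(2)| - |\mathcal{D}_t(1)| = 2S_a - S_a$. 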
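
As a sanity check on the second inclusion in (32), the two sides can be expanded in terms of the increasing and decreasing sets. The following is a sketch under two assumptions that are not restated in this appendix: that the set of small elements satisfies $\mathcal{S}_t(2) = \mathcal{I}_t(1) \cup \mathcal{D}_t(1)$ (the elements with magnitude $r$), and that $\mathcal{A}_t = \mathcal{I}_t(1)$. With the update rule (28) and $\mathcal{R}_t = \mathcal{D}_t(0)$, this gives

```latex
\begin{align*}
\mathcal{S}_{t-1}(2)
  &= \mathcal{I}_{t-1}(1) \cup \mathcal{D}_{t-1}(1)
   = \mathcal{I}_t(2) \cup \mathcal{R}_t, \\
\mathcal{S}_{t-1}(2) \cup \mathcal{A}_t \setminus \mathcal{R}_t
  &= \mathcal{I}_t(2) \cup \mathcal{I}_t(1), \\
\mathcal{S}_t(2) \cup \mathcal{I}_t(2) \setminus \mathcal{D}_t(1)
  &= \mathcal{I}_t(1) \cup \mathcal{I}_t(2).
\end{align*}
```

Under these assumptions the two sides coincide, and removing $\mathcal{I}_t(2)$ leaves only $\mathcal{I}_t(1)$, i.e. at most $S_a$ elements of magnitude $r$.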

Consider the deletion step. Apply Lemma 5 with $S_T = 
S_0 + 2S_a$, $S_\Delta = S_a$. Since condition 2a holds, $\delta_{S_0+2S_a} < 1/2$ 
holds. Since $\Delta_{add,t} \subseteq \mathcal{S}_t(2) \setminus \mathcal{D}_t(1)$, $\Delta_{add,t}$ contains at 
most $S_a$ elements of magnitude $r$ and nothing else. Thus, 
$\|x_{\Delta_{add,t}}\| \le \sqrt{S_a}\, r$. Using these facts and condition 1b, 
by Lemma 5 all elements of $\Delta_{e,add,t}$ will get deleted. Thus 
$|\tilde\Delta_{e,t}| = 0$. Thus $|\tilde T_t| \le |N_t| + |\tilde\Delta_{e,t}| \le S_0$. 

To bound $|\tilde\Delta_t|$, apply Lemma 4 with $S_T = S_0 + 2S_a$, $S_\Delta = 
S_a$, $b_1 = 2r$. By this lemma, to ensure that all elements of 
$L$ do not get falsely deleted, we need $\delta_{S_0+2S_a} < 1/2$ and 
$2r > \alpha_{del} + \sqrt{2}\epsilon + 2\theta_{S_0+2S_a,S_a}\sqrt{S_a}\, r$. From condition 1b, 
$\alpha_{del} = \sqrt{2}\epsilon + 2\theta_{S_0+2S_a,S_a}\sqrt{S_a}\, r$. Thus, we need $\delta_{S_0+2S_a} < 
1/2$ and $2r > 2(\sqrt{2}\epsilon + 2\theta_{S_0+2S_a,S_a}\sqrt{S_a}\, r)$. $\delta_{S_0+2S_a} < 1/2$ 
holds since condition 2a holds. The second condition holds 
since condition 2b and $r > G_2$ of condition 3 hold. Thus, we 
can ensure that all elements of $L$, i.e. all elements of $T_{add,t}$ 
with magnitude greater than or equal to $b_1 = 2r$ do not get 
falsely deleted. But nothing can be said about the elements 
smaller than $2r$ (in the worst case all of them may get falsely 
deleted). Thus, $\tilde\Delta_t \subseteq \mathcal{S}_t(2)$ and so $|\tilde\Delta_t| \le 2S_a$. 
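
The requirement $2r > 2(\sqrt{2}\epsilon + 2\theta\sqrt{S_a}\, r)$ is linear in $r$, so it can be rearranged to $r(1 - 2\theta\sqrt{S_a}) > \sqrt{2}\epsilon$. The sketch below evaluates the resulting lower limit on $r$ for given values; the function name `min_increase_rate` and the numerical values are illustrative assumptions, and `theta` stands for $\theta_{S_0+2S_a,S_a}$. It is not claimed to equal the constant $G_2$ of the theorem.

```python
import math


def min_increase_rate(eps, theta, Sa):
    """Infimum of r satisfying 2r > 2*(sqrt(2)*eps + 2*theta*sqrt(Sa)*r).

    Returns math.inf when 2*theta*sqrt(Sa) >= 1, in which case no r works.
    """
    slack = 1.0 - 2.0 * theta * math.sqrt(Sa)
    if slack <= 0:
        return math.inf
    return math.sqrt(2.0) * eps / slack


# Hypothetical values: eps = 1, theta = 0.1, Sa = 4.
r_min = min_increase_rate(1.0, 0.1, 4)
r = 1.01 * r_min
print(2 * r > 2 * (math.sqrt(2) * 1.0 + 2 * 0.1 * math.sqrt(4) * r))  # True
```

The rearrangement makes explicit that a larger $\theta$ or a larger $S_a$ shrinks the slack term and therefore demands a larger increase rate.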

This finishes the proof of the first claim. To prove the second 
and third claims for any $t > 0$: use the first claim for $t-1$ 
and the arguments from the paragraphs above to show that the 
second and third claims hold for $t$. The fourth claim follows 
directly from the first claim and fact 4 of Proposition 2 (applied 
with $x \equiv x_t$, $T \equiv \tilde T_t$, $\Delta \equiv \tilde\Delta_t$). The fifth claim follows 
directly from the second claim and Corollary 1. 

D. Appendix: Generalized version of Corollary 3

Corollary 4 (Stability of modified-CS with add-LS-del - 3): 
Assume Signal Model 1 and $\|w_t\| \le \epsilon$. Let 
$e_t := (x_t - \hat x_{add,t})_{T_{add,t}}$. Assume that the LS step error 
is spread out enough so that 



< 



S a 



at all $t$. Consider Algorithm 2. If, for some $1 \le d_0 \le d$, 

1) (addition and deletion thresholds) 

a) $\alpha_{add}$ is large enough so that there are at most $f$ 
false additions per unit time, 

b) a de i = -y/^Cme + 2fc 3 6»5 0+ s a+/ , fe2 C m r, 

2) (support size, support change rate) So,S a satisfy 

a) S So+Sa{1+kl) <(V2- l)/2 and S a < ^, 

b) S So+ s a+f < 1/2, 

C) So +S a +f,k 2 S a < 54^Gl' 

3) (new element increase rate) r > max(G 1 , G 2 ), where 
a add + 8.79e 



Gi ^ 



G 2 4 



do 



2y/2C„ 



r Sa{do — 4:k 3 6 So +S a +f,k 2 S a (m) 



(33) 



4) (initial time) $n_0$ is large enough to ensure that $\tilde\Delta_0 \subseteq 
\mathcal{S}_0(d_0)$, $|\tilde\Delta_0| \le (2d_0 - 2)S_a$, $|\tilde\Delta_{e,0}| = 0$, $|\tilde T_0| \le S_0$, 

where 

$k_1 = \max(1, 2d_0 - 2)$ 
$k_2 = \max(0, 2d_0 - 3)$ 



i 



d -l 

E 

3=1 



f + E i 2 



(34) 



then, 

1) at all $t \ge 0$, $|\tilde T_t| \le S_0$, $|\tilde\Delta_{e,t}| = 0$, and $\tilde\Delta_t \subseteq \mathcal{S}_t(d_0)$ 
and so $|\tilde\Delta_t| \le (2d_0 - 2)S_a$, 

2) at all $t > 0$, $|T_t| \le S_0$, $|\Delta_{e,t}| \le S_a$, and $|\Delta_t| \le k_1 S_a$, 

3) at all $t > 0$, $|T_{add,t}| \le S_0 + S_a + f$, $|\Delta_{e,add,t}| \le S_a + f$, 
and $|\Delta_{add,t}| \le k_2 S_a$, 

4) at all $t > 0$, $\|x_t - \hat x_t\| \le \sqrt{2}\epsilon + k_3\sqrt{S_a}\,(2\theta_{S_0,(2d_0-2)S_a} + 
1)\, r$, 

5) at all $t > 0$, $\|x_t - \hat x_{t,modcs}\| \le C_1(S_0 + S_a + k_1 S_a)\epsilon \le 
8.79\epsilon$. 

Proof: The proof follows using exactly the same steps as in 
the proof of Theorem [2] but of course with Lemmas 2] and |5] 
replaced by Lemmas [6] and [7] respectively. The only difference 
is that, instead of ensuring $|\tilde\Delta_{e,t}| = 0$ and $\tilde\Delta_t \subseteq \mathcal{S}_t(2)$, we try 
to ensure $|\tilde\Delta_{e,t}| = 0$ and $\tilde\Delta_t \subseteq \mathcal{S}_t(d_0)$ for some $d_0 \le d$. For 
$1 < d_0 \le d$, notice that $|\mathcal{S}_t(d_0)| = (2d_0 - 2)S_a$. Also, since, 
now, $\Delta_{add,t} \subseteq \mathcal{S}_t(d_0) \setminus \mathcal{D}_t(d_0 - 1)$, so $|\Delta_{add,t}| \le (2d_0 - 3)S_a$ 
and $\|x_{\Delta_{add,t}}\| \le k_3\sqrt{S_a}\, r$. The case of $d_0 = 1$ is handled separately. 
In this case, $\mathcal{S}_t(d_0)$ is empty, but still $\Delta_t$ is not empty, but is 
equal to $\mathcal{A}_t$. Also, $\Delta_{add,t}$ and $\tilde\Delta_t$ are empty. 

E. Proof of Lemma 9

From Lemma U if \\w\\ < e, 6 2 \a\ < (s/2 - l)/2 
and 5\ T \ < 1/2, then \\x - £cs res || < C'(\T\, |A|)e + 
0|t|,|a|C"(|T|,|A|)||2!a||. Using the fact that ||xa|| < 
y^\Li \b + ||xa 2 ||; fact [TJ of Proposition [2j and the fact that 
for all i G L\, \xi\ > 76, we can conclude that all i G L\ will 
get detected if 



1) <5 2 |A| < (V2-l)/2, 

2) 5| T | < 1/2 and 

3) a add + C'e + eC"{ 
\\xa 2 II < K b an d l-^i 

a) 0C" < 



b) 



2(- v /3Z7+k; 




a;A a ||) < 7 & - Using 
this inequality holds if 



y-eC"{-/SJX+K) 

Since we only know that $|T| \le S_T$, $|\Delta| \le S_\Delta$, we need the 
above inequalities to hold for all values of $|T|$, $|\Delta|$ satisfying 
these upper bounds. This leads to the conclusion of the 
lemma. Notice that the LHS's of the first two inequalities are 
non-decreasing functions of $|\Delta|$, $|T|$ and thus the lemma just 
uses their upper bounds. The LHS's of the last two are 
non-decreasing in $|T|$, but are not monotonic in $|\Delta|$ (since 
$C'(|T|,|\Delta|)$ and $C''(|T|,|\Delta|)$ are not monotonic in $|\Delta|$). 
Hence we explicitly maximize over $|\Delta| \le S_\Delta$. 

F. Proof of Lemma [7J 

We provide the proof here for the sake of completeness and for 
ease of review. This will be removed later. Let $h := \hat x_{modcs} - x$. 
We adapt the approach of [10] to bound the reconstruction 
error, $\|h\| := \|x - \hat x\|$. A similar result was obtained in [29]. 
Let $\Delta_1$ denote the set of indices of $h$ with the $|\Delta|$ largest 
values outside of $T \cup \Delta$, let $\Delta_2$ denote the indices of the next 
$|\Delta|$ largest values and so on. Then, using the same approach 
as that of [10], 

$$\|h_{(T \cup \Delta \cup \Delta_1)^c}\| \le \sum_{j \ge 2} \|h_{\Delta_j}\| \le \frac{1}{\sqrt{|\Delta|}}\, \|h_{(T \cup \Delta)^c}\|_1 \tag{35}$$

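Since $\hat x_{modcs} = x + h$ is the minimizer of the modified-CS problem and since both 
$x$ and $\hat x_{modcs}$ are feasible; and since $x$ is supported on $N \subseteq 
T \cup \Delta$, 

$$\|x_\Delta\|_1 = \|x_{T^c}\|_1 \ge \|(x + h)_{T^c}\|_1 \ge \|x_\Delta\|_1 - \|h_\Delta\|_1 + \|h_{(T \cup \Delta)^c}\|_1 \tag{36}$$

Thus, 

$$\|h_{(T \cup \Delta)^c}\|_1 \le \|h_\Delta\|_1 \tag{37}$$

Combining this with (35) and using $\|h_\Delta\|_1 \le \sqrt{|\Delta|}\, \|h_\Delta\|$, we get 

$$\|h_{(T \cup \Delta \cup \Delta_1)^c}\| \le \sum_{j \ge 2} \|h_{\Delta_j}\| \le \|h_\Delta\| \tag{38}$$

Next, since both $x$ and $\hat x_{modcs}$ are feasible, 

$$\|Ah\| = \|A(x - \hat x_{modcs})\| \le \|y - Ax\| + \|y - A\hat x_{modcs}\| \le 2\epsilon \tag{39}$$

In this proof, let 

$$\delta := \delta_{|T|+2|\Delta|} \quad \text{and} \quad \theta := \theta_{|T|,|\Delta|} \tag{40}$$
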



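Now, we upper bound $\|h_{T \cup \Delta \cup \Delta_1}\|$. To do that, notice that 

$$(1 - \delta)\, \|h_{T \cup \Delta \cup \Delta_1}\|^2 \le \|A h_{T \cup \Delta \cup \Delta_1}\|^2 \tag{41}$$

To bound the RHS of the above, notice that $A h_{T \cup \Delta \cup \Delta_1} = 
Ah - \sum_{j \ge 2} A h_{\Delta_j}$ and so 

$$\|A h_{T \cup \Delta \cup \Delta_1}\|^2 = \langle A h_{T \cup \Delta \cup \Delta_1}, Ah \rangle - \sum_{j \ge 2} \langle A h_{T \cup \Delta \cup \Delta_1}, A h_{\Delta_j} \rangle$$
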

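Using (39) and the definition of $\delta_S$, 

$$|\langle A h_{T \cup \Delta \cup \Delta_1}, Ah \rangle| \le 2\epsilon\sqrt{1 + \delta}\, \|h_{T \cup \Delta \cup \Delta_1}\| \tag{42}$$

Using the definition of $\theta_{S_1,S_2}$; equation (38); and 
the fact that $\|h_T\| + \|h_{\Delta \cup \Delta_1}\| \le \sqrt{2}\, \|h_{T \cup \Delta \cup \Delta_1}\|$, we get the 
following. If $2|\Delta| \le |T|$, 

$$\Big| \sum_{j \ge 2} \langle A h_{T \cup \Delta \cup \Delta_1}, A h_{\Delta_j} \rangle \Big| \le \sum_{j \ge 2} \big( \theta_{|T|,|\Delta|} \|h_T\| + \theta_{2|\Delta|,|\Delta|} \|h_{\Delta \cup \Delta_1}\| \big) \|h_{\Delta_j}\| \le \sqrt{2}\, \theta\, \|h_{T \cup \Delta \cup \Delta_1}\|\, \|h_\Delta\| \tag{43}$$
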



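Combining the last four equations above, if $2|\Delta| \le |T|$, 

$$(1 - \delta)\, \|h_{T \cup \Delta \cup \Delta_1}\| \le 2\epsilon\sqrt{1 + \delta} + \sqrt{2}\, \theta\, \|h_\Delta\| \tag{44}$$

Using $\|h_\Delta\| \le \|h_{T \cup \Delta \cup \Delta_1}\|$, we can simplify the above to get 

$$\|h_{T \cup \Delta \cup \Delta_1}\| \le \frac{2\sqrt{1 + \delta}}{1 - \delta - \sqrt{2}\,\theta}\, \epsilon \tag{45}$$

Finally, using (38) and $\|h_\Delta\| \le \|h_{T \cup \Delta \cup \Delta_1}\|$ and the above, 

$$\|h\| \le 2\, \|h_{T \cup \Delta \cup \Delta_1}\| \le \frac{4\sqrt{1 + \delta}}{1 - \delta - \sqrt{2}\,\theta}\, \epsilon \tag{46}$$
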

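Clearly, all of the above discussion holds only if the RHS 
is positive, which is true only if $\delta + \sqrt{2}\,\theta < 1$. Also, (43) and 
hence everything after that needs $2|\Delta| \le |T|$. Since $|T| = 
|N| + |\Delta_e| - |\Delta|$, this will hold if $3|\Delta| \le |N|$. Thus, we get 
the following result. 

Corollary 5: If $|\Delta| \le |N|/3$ and if $\delta_{|T|+2|\Delta|} + 
\sqrt{2}\,\theta_{|T|,|\Delta|} < 1$, then 

$$\|h\| \le \frac{4\sqrt{1 + \delta_{|T|+2|\Delta|}}}{1 - \delta_{|T|+2|\Delta|} - \sqrt{2}\,\theta_{|T|,|\Delta|}}\, \epsilon \tag{47}$$
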



Using $\theta_{|T|,|\Delta|} \le \delta_{|T|+|\Delta|} \le \delta_{|T|+2|\Delta|}$ in both the required 
sufficient condition and in the bound; and by substituting 
$|T| = |N| + |\Delta_e| - |\Delta|$; and by using $1/(1 + \sqrt{2}) = \sqrt{2} - 1$ we get 
the notationally simpler result of the lemma. 
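
The constant multiplying $\epsilon$ in (47) is easy to evaluate for given values of the two restricted constants. The sketch below is illustrative only: the function name `modcs_error_constant` is an assumption, and `delta` and `theta` stand for $\delta_{|T|+2|\Delta|}$ and $\theta_{|T|,|\Delta|}$.

```python
import math


def modcs_error_constant(delta, theta):
    """Return 4*sqrt(1+delta) / (1 - delta - sqrt(2)*theta).

    Returns None when delta + sqrt(2)*theta >= 1, i.e. when the
    sufficient condition of Corollary 5 does not hold.
    """
    denom = 1.0 - delta - math.sqrt(2.0) * theta
    if denom <= 0:
        return None
    return 4.0 * math.sqrt(1.0 + delta) / denom


# With theta replaced by its upper bound delta, the condition becomes
# delta < sqrt(2) - 1, and delta = (sqrt(2) - 1) / 2 gives roughly 8.79.
d = (math.sqrt(2.0) - 1.0) / 2.0
print(round(modcs_error_constant(d, d), 2))  # 8.79
```

The value at $\delta = \theta = (\sqrt{2} - 1)/2$ agrees with the constant $8.79$ that appears in the bound on the modified-CS error in Corollary 4.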
(d) n = 59, r = 2/5, d = 5 

Fig. 3. Normalized MSE (NMSE), normalized number of extras and normalized number of misses over time for modified-CS (mod-CS), 
modified-CS with add-LS-del (mod-CS-add-LS-del), LS-CS and simple CS. In all cases, NMSE for simple CS was more than 20% (plotted 
only in (a) and (b)). We cannot use a logarithmic y-axis for plotting support errors since in some cases the errors are exactly zero. 



